Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

read_sbml_model does not substitute legacy SBML placeholders anymore #1149

Open
rtsantos3 opened this issue Feb 3, 2022 · 10 comments
Open
Labels
SBML Related to reading and writing SBML models.

Comments

@rtsantos3
Copy link

rtsantos3 commented Feb 3, 2022

Hi. I'm trying to convert an SBML 2 model and I've encountered a bug in the model.manipulation.modify module. Apparently there's an error in one of the functions which prevents the parser from replacing the escaped strings.

def _escape_str_id(id_str):
    """make a single string id SBML compliant"""
    for c in ("'", '"'):
        if id_str.startswith(c) and id_str.endswith(c) \
                and id_str.count(c) == 2:
            id_str = id_str.strip(c)
    for char, escaped_char in _renames:
        id_str = id_str.replace(char, escaped_char)
    return id_str

Interchanging the char & escaped_char fixes the problem with the function as well as others that use it (like escape_ID()).

def _escape_str_id(id_str):
    """make a single string id SBML compliant"""
    for c in ("'", '"'):
        if id_str.startswith(c) and id_str.endswith(c) \
                and id_str.count(c) == 2:
            id_str = id_str.strip(c)
    for char, escaped_char in _renames:
        id_str = id_str.replace(escaped_char, char)
    return id_str
@cdiener
Copy link
Member

cdiener commented Feb 3, 2022

Maybe I'm overlooking something but that change seems like it wouldn't work. char is supposed to be replaced with escaped_char in this function and that is what the old function was doing. The change would try to do the opposite which would not change the ID. Can you fill in the rest from the issue template so we can see what is going on?

@rtsantos3
Copy link
Author

rtsantos3 commented Feb 3, 2022

Yeah, sure. Each of the items in the _renames set is structured as [(symbol, esc_symbol)] so when the replace function tries to look for the first argument, what it tries to looks for in the id is the symbol instead.

Try this short code snippet:

_renames = (
    (".", "_DOT_"),
    ("(", "_LPAREN_"),
    (")", "_RPAREN_"))

id_to_fix = "EX_co2_LPAREN_e_RPAREN__BS"

def _escape_str_id_old(id_str):
    """make a single string id SBML compliant"""
    for c in ("'", '"'):
        if id_str.startswith(c) and id_str.endswith(c) \
                and id_str.count(c) == 2:
            id_str = id_str.strip(c)
    for char, escaped_char in _renames:
        id_str = id_str.replace(char, escaped_char)
    return id_str

def _escape_str_id_new(id_str):
    """make a single string id SBML compliant"""
    for c in ("'", '"'):
        if id_str.startswith(c) and id_str.endswith(c) \
                and id_str.count(c) == 2:
            id_str = id_str.strip(c)
    for char, escaped_char in _renames:
        id_str = id_str.replace(escaped_char, char)
    return id_str

print(escape_str_id_old(id_str))
print(escape_str_id_new(id_str))

The latter one should print correctly.

@cdiener
Copy link
Member

cdiener commented Feb 3, 2022

Yeah but that is not what the function is supposed to do. As the docstring indicates it is supposed to transform a non-SBML compliant string to an SBML-compliant one. EX_co2_LPAREN_e_RPAREN__BS is SBML-compliant and no replacements should be made. EX_co2(e) is not SBML-compliant and should be changed to EX_co2_LPAREN_e_RPAREN_ for instance. If it is used somewhere else in the codebase to do the opposite the bug would be at that position and not here. And the old one behaves correctly here:

In [2]: _escape_str_id_old("EX_co2(e)")
Out[2]: 'EX_co2_LPAREN_e_RPAREN_'

In [3]: _escape_str_id_new("EX_co2(e)")
Out[3]: 'EX_co2(e)'

@rtsantos3
Copy link
Author

rtsantos3 commented Feb 3, 2022

Oh, okay. Haha. My apologies. Thought it was the other way around (The docstring being the non-compliant format). Thank you for the clarification.

Seems I found a new use for this function -- it turns the SBML-compliant format to a more human-readable one. :)

@cdiener
Copy link
Member

cdiener commented Feb 3, 2022

Sure, no problem. Yep, but that is also done by default in the SBMl reader. See the top of the file here. So EX_co2_LPAREN_e_RPAREN_ should be read as EX_co2(e) in the cobra Model by default.

@cdiener cdiener closed this as completed Feb 3, 2022
@rtsantos3
Copy link
Author

Hmm. That's weird. It doesn't automatically change when I'm using read_sbml_file() though, which is why I looked for a parser with that specific function instead.

Should I open another ticket for this instead? Thanks a lot.

@cdiener cdiener reopened this Feb 3, 2022
@cdiener
Copy link
Member

cdiener commented Feb 3, 2022

Nope, can leave it here. Can you provide a code example (and preferably the model) where it does not substitute the ID?

@rtsantos3
Copy link
Author

Sure, no problem. Here's the model I've been using:
ios2164.zip

Here's a short code snippet from my notebook:

import cobra

ios2164 = cobra.io.read_sbml_model("../ios2164.xml")
for rxns in ios2164.exchanges:
    print(rxns.id)

@cdiener
Copy link
Member

cdiener commented Feb 3, 2022

Oh yeah sorry. It only substitutes IDs with the new pattern which is _[ASCII CODE]_ and not the legacy substitutions. This should probably be added into the F_REPLACE functions.

As a reference, here is the output of the code:

[...billions of warnings...]
EX_co2_LPAREN_e_RPAREN_
EX_h2o_LPAREN_e_RPAREN_
EX_h_LPAREN_e_RPAREN_
EX_no3_LPAREN_e_RPAREN_
EX_o2_LPAREN_e_RPAREN_
EX_pi_LPAREN_e_RPAREN_
EX_sucr_LPAREN_e_RPAREN_
EX_fru_DASH_B_LPAREN_e_RPAREN_
EX_glc_DASH_A_LPAREN_e_RPAREN_
EX_so4_LPAREN_e_RPAREN_
EX_hco3_LPAREN_e_RPAREN_
EX_so3_LPAREN_e_RPAREN_
EX_h2s_LPAREN_e_RPAREN_
EX_nh4_LPAREN_e_RPAREN_
EX_asn_DASH_L_LPAREN_e_RPAREN_
EX_gln_DASH_L_LPAREN_e_RPAREN_
EX_etoh_LPAREN_e_RPAREN_
EX_ac_LPAREN_e_RPAREN_
EX_lac_DASH_L_LPAREN_e_RPAREN_
EX_tsul_LPAREN_e_RPAREN_
EX_fe2_LPAREN_e_RPAREN_
EX_fe3_LPAREN_e_RPAREN_
EX_mg2_LPAREN_e_RPAREN_
EX_ala_DASH_L_LPAREN_e_RPAREN_
EX_arg_DASH_L_LPAREN_e_RPAREN_
EX_asp_DASH_L_LPAREN_e_RPAREN_
EX_cys_DASH_L_LPAREN_e_RPAREN_
EX_glu_DASH_L_LPAREN_e_RPAREN_
EX_gly_LPAREN_e_RPAREN_
EX_his_DASH_L_LPAREN_e_RPAREN_
EX_ile_DASH_L_LPAREN_e_RPAREN_
EX_leu_DASH_L_LPAREN_e_RPAREN_
EX_lys_DASH_L_LPAREN_e_RPAREN_
EX_met_DASH_L_LPAREN_e_RPAREN_
EX_phe_DASH_L_LPAREN_e_RPAREN_
EX_pro_DASH_L_LPAREN_e_RPAREN_
EX_ser_DASH_L_LPAREN_e_RPAREN_
EX_thr_DASH_L_LPAREN_e_RPAREN_
EX_trp_DASH_L_LPAREN_e_RPAREN_
EX_tyr_DASH_L_LPAREN_e_RPAREN_
EX_val_DASH_L_LPAREN_e_RPAREN_
EX_photonVis_LPAREN_e_RPAREN_

@cdiener cdiener changed the title Bug in Model.manipulation module read_sbml_model does not substitute legacy SBML placeholders anymore Feb 3, 2022
@cdiener cdiener added the SBML Related to reading and writing SBML models. label Feb 3, 2022
@rtsantos3
Copy link
Author

No problem. Either way I've found what I needed. Haha. Thanks a lot. 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
SBML Related to reading and writing SBML models.
Projects
None yet
Development

No branches or pull requests

2 participants