Reading and writing knowledge bases to/from files.
Supported file types:
Comma separated values (.csv)
Excel (.xlsx)
Tab separated values (.tsv)
- Author:
Jonathan Karr <karr@mssm.edu>
- Date:
2018-02-12
- Copyright:
2018, Karr Lab
- License:
MIT
3.2.5. Module Contents
3.2.5.1. Classes
Writer | Write knowledge base to file(s)
Reader | Read knowledge base from file(s)
3.2.5.2. Functions
convert | Convert among Excel (.xlsx), comma separated (.csv), and tab separated (.tsv) file formats
create_template | Create file with knowledge base template, including row and column headings
3.2.5.3. Attributes
- class wc_kb.io.Writer
Bases: obj_tables.io.Writer
Write knowledge base to file(s)
- run(core_path, knowledge_base, seq_path=None, rewrite_seq_path=True, taxon='eukaryote', models=None, get_related=True, include_all_attributes=False, validate=True, title=None, description=None, keywords=None, version=None, language=None, creator=None, write_schema=False, write_toc=True, extra_entries=0, data_repo_metadata=False, schema_package=None, protected=True)
Write knowledge base to file(s)
- Parameters:
knowledge_base (core.KnowledgeBase) – knowledge base
core_path (str) – path to save core knowledge base
seq_path (str, optional) – path to save genome sequence
rewrite_seq_path (bool, optional) – if True, the path to genome sequence in the saved knowledge base will be updated to the newly saved seq_path
taxon (str, optional) – type of model order to use
models (list of Model, optional) – models in the order that they should appear as worksheets; all models which are not in models will follow in alphabetical order
get_related (bool, optional) – if True, write object and all related objects
include_all_attributes (bool, optional) – if True, export all attributes including those not explicitly included in Model.Meta.attribute_order
validate (bool, optional) – if True, validate the data
title (str, optional) – title
description (str, optional) – description
keywords (str, optional) – keywords
version (str, optional) – version
language (str, optional) – language
creator (str, optional) – creator
write_schema (bool, optional) – if True, include additional worksheet with schema
write_toc (bool, optional) – if True, include additional worksheet with table of contents
extra_entries (int, optional) – additional entries to display
data_repo_metadata (bool, optional) – if True, try to write metadata information about the file’s Git repo; the repo must be current with origin, except for the file
schema_package (str, optional) – the package which defines the obj_tables schema used by the file; if not None, try to write metadata information about the schema’s Git repository: the repo must be current with origin
protected (bool, optional) – if True, protect the worksheet
- Raises:
ValueError – if any of the relationships with knowledge bases and cells are not set
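As a usage illustration of the parameters above, the following is a minimal sketch of a wrapper around `Writer.run`. The helper name `write_kb`, the file names `core.xlsx` and `seq.fna`, and the optional `writer` argument are assumptions made for this example and are not part of wc_kb; only `Writer.run` and its documented keyword arguments come from the reference above.

```python
import os


def write_kb(knowledge_base, out_dir, writer=None):
    """Write a knowledge base to a workbook and a genome sequence file.

    `writer` is a hypothetical hook: any object with a compatible
    `run` method can be passed, which keeps the sketch easy to try out.
    """
    if writer is None:
        # Imported lazily so the sketch loads even where wc_kb is not installed.
        from wc_kb import io
        writer = io.Writer()

    # File names are illustrative assumptions, not wc_kb conventions.
    core_path = os.path.join(out_dir, 'core.xlsx')
    seq_path = os.path.join(out_dir, 'seq.fna')

    # Per the reference above, run raises ValueError if relationships to
    # the knowledge base and cell are not set; it is left to propagate here.
    writer.run(core_path, knowledge_base, seq_path=seq_path, taxon='eukaryote')
    return core_path, seq_path
```

Calling `write_kb(kb, 'out')` with a populated `core.KnowledgeBase` would then be expected to produce both files in `out`, assuming that directory exists.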
- classmethod validate_implicit_relationships()
Check that relationships to core.KnowledgeBase and core.Cell do not need to be explicitly written to workbooks because they can be inferred by Reader.run
- Raises:
Exception – if the Excel serialization involves an unsupported implicit relationship
- validate_implicit_relationships_are_set(knowledge_base)
Check that there is only 1 KnowledgeBase and <= 1 Cell and that each relationship to KnowledgeBase and Cell is set. This is necessary to enable the KnowledgeBase and Cell relationships to be implicit in the Excel output and added by Reader.run
- Parameters:
knowledge_base (core.KnowledgeBase) – knowledge base
- Raises:
ValueError – if there are multiple instances of core.KnowledgeBase in the object graph
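The counting part of this check (exactly one knowledge base, at most one cell) can be illustrated with plain Python. This is a sketch of the invariant only, not the wc_kb implementation: the `KnowledgeBase` and `Cell` classes below are empty stand-ins defined for the example, `check_single_root` is a hypothetical name, and the check that each relationship is set is not reproduced.

```python
class KnowledgeBase:
    """Stand-in for core.KnowledgeBase (illustration only)."""


class Cell:
    """Stand-in for core.Cell (illustration only)."""


def check_single_root(objects):
    """Raise ValueError unless there is exactly 1 KnowledgeBase and <= 1 Cell."""
    n_kbs = sum(isinstance(obj, KnowledgeBase) for obj in objects)
    n_cells = sum(isinstance(obj, Cell) for obj in objects)
    if n_kbs != 1:
        raise ValueError(
            'expected exactly 1 KnowledgeBase, found {}'.format(n_kbs))
    if n_cells > 1:
        raise ValueError('expected at most 1 Cell, found {}'.format(n_cells))


# A graph with one knowledge base and one cell passes silently.
check_single_root([KnowledgeBase(), Cell()])
```

With a single root of each kind, a reader can attach every other object to that knowledge base and cell without the relationship being written out.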
- class wc_kb.io.Reader
Bases: obj_tables.io.Reader
Read knowledge base from file(s)
- run(core_path, seq_path='', rewrite_seq_path=True, taxon='eukaryote', models=None, ignore_missing_models=None, ignore_extra_models=None, ignore_sheet_order=None, include_all_attributes=False, ignore_missing_attributes=None, ignore_extra_attributes=None, ignore_attribute_order=None, group_objects_by_model=True, validate=True, read_metadata=False)
Read knowledge base from file(s)
- Parameters:
core_path (str) – path to core knowledge base
seq_path (str) – path to genome sequence
rewrite_seq_path (bool, optional) – if True, the path to genome sequence in the knowledge base will be updated to the provided seq_path
taxon (str, optional) – type of model order to use
models (types.TypeType or list of types.TypeType, optional) – type of object to read or list of types of objects to read
ignore_missing_models (bool, optional) – if False, report an error if a worksheet/file is missing for one or more models
ignore_extra_models (bool, optional) – if True and all models are found, ignore other worksheets or files
ignore_sheet_order (bool, optional) – if True, do not require the sheets to be provided in the canonical order
include_all_attributes (bool, optional) – if True, export all attributes including those not explicitly included in Model.Meta.attribute_order
ignore_missing_attributes (bool, optional) – if False, report an error if a worksheet/file doesn’t contain all of the attributes in a model in models
ignore_extra_attributes (bool, optional) – if True, do not report errors if attributes in the data are not in the model
ignore_attribute_order (bool) – if True, do not require the attributes to be provided in the canonical order
group_objects_by_model (bool, optional) – if True, group decoded objects by their types
validate (bool, optional) – if True, validate the data
read_metadata (bool, optional) – if True, read metadata models
- Returns:
model objects grouped by obj_tables.Model class
- Return type:
dict
- Raises:
ValueError – if core_path defines multiple knowledge bases or cells, or represents objects that cannot be linked to a knowledge base and/or cell
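To show how the returned dictionary might be consumed, here is a minimal sketch that reads a knowledge base and counts the objects per model class. The helper name `summarize_kb` and the optional `reader` argument are assumptions for this example. The sketch also assumes the default `group_objects_by_model=True`, so that each value is a collection of instances keyed by its model class.

```python
def summarize_kb(core_path, seq_path, reader=None):
    """Read a knowledge base and return {model class name: object count}.

    `reader` is a hypothetical hook: any object with a compatible
    `run` method can be passed in place of wc_kb.io.Reader.
    """
    if reader is None:
        # Imported lazily so the sketch loads even where wc_kb is not installed.
        from wc_kb import io
        reader = io.Reader()

    # Per the reference above, run raises ValueError if core_path defines
    # multiple knowledge bases or cells; it is left to propagate here.
    objects = reader.run(core_path, seq_path=seq_path, taxon='eukaryote')

    # Assumes objects maps each model class to a sized collection of instances.
    return {model.__name__: len(instances)
            for model, instances in objects.items()}
```

A summary like this can serve as a quick sanity check after reading, for example to confirm that the expected worksheets were actually populated.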
- wc_kb.io.convert(source_core, source_seq, dest_core, dest_seq, taxon='eukaryote', rewrite_seq_path=True, protected=True)
Convert among Excel (.xlsx), comma separated (.csv), and tab separated (.tsv) file formats
Read a knowledge base from the source file(s) and write it to the destination file(s). A path to a set of delimiter-separated knowledge base files must be given as a Unix glob pattern (with a *) that matches all of the delimiter-separated files.
- Parameters:
source_core (str) – path to the core of the source knowledge base
source_seq (str) – path to the genome sequence of the source knowledge base
dest_core (str) – path to save the converted core of the knowledge base
dest_seq (str) – path to save the converted genome sequence of the knowledge base
taxon (str) – taxon
rewrite_seq_path (bool, optional) – if True, the path to genome sequence in the converted core of the knowledge base will be updated to the path of the converted genome sequence
protected (bool, optional) – if True, protect the worksheet
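The glob-pattern requirement for delimiter-separated files can be sketched as follows. The helper names `core_path_for` and `convert_kb`, the `core` file stem, the `core-*` naming, and the `seq.fna` file name are assumptions made for this example. The documentation above only requires that a delimited path contain a `*` matching all of the files.

```python
import os

SUPPORTED_EXTENSIONS = ('.csv', '.tsv', '.xlsx')


def core_path_for(directory, extension, stem='core'):
    """Build a core path: one workbook for .xlsx, a glob pattern for .csv/.tsv."""
    if extension not in SUPPORTED_EXTENSIONS:
        raise ValueError('unsupported extension: {}'.format(extension))
    if extension == '.xlsx':
        return os.path.join(directory, stem + extension)
    # Delimited formats use multiple files, so the path is a glob with '*'.
    return os.path.join(directory, stem + '-*' + extension)


def convert_kb(source_core, source_seq, dest_dir, dest_extension, convert=None):
    """Convert a knowledge base into dest_dir in the requested format."""
    if convert is None:
        # Imported lazily so the sketch loads even where wc_kb is not installed.
        from wc_kb.io import convert
    dest_core = core_path_for(dest_dir, dest_extension)
    dest_seq = os.path.join(dest_dir, 'seq.fna')  # illustrative file name
    convert(source_core, source_seq, dest_core, dest_seq, taxon='eukaryote')
    return dest_core, dest_seq
```

For example, `core_path_for('kb', '.csv')` yields a pattern ending in `core-*.csv`, while `core_path_for('kb', '.xlsx')` yields a single workbook path.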
- wc_kb.io.create_template(core_path, seq_path, taxon='eukaryote', write_schema=False, write_toc=True, extra_entries=10, data_repo_metadata=True, protected=True)
Create file with knowledge base template, including row and column headings
- Parameters:
core_path (str) – path to save template of core knowledge base
seq_path (str) – path to save genome sequence
taxon (str, optional) – taxon
write_schema (bool, optional) – if True, include additional worksheet with schema
write_toc (bool, optional) – if True, include additional worksheet with table of contents
extra_entries (int, optional) – additional entries to display
data_repo_metadata (bool, optional) – if True, try to write metadata information about the file’s Git repo
protected (bool, optional) – if True, protect the worksheet
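A template could be generated along the following lines. This is a sketch under stated assumptions: the helper name `make_template`, the file names `template.xlsx` and `template_seq.fna`, and the optional `create_template` argument are hypothetical. Passing `data_repo_metadata=False` is a choice made here so that the call does not depend on the output directory being inside a Git repository.

```python
import os


def make_template(out_dir, create_template=None):
    """Create a knowledge base template in out_dir and return its paths."""
    if create_template is None:
        # Imported lazily so the sketch loads even where wc_kb is not installed.
        from wc_kb.io import create_template

    # File names are illustrative assumptions, not wc_kb conventions.
    core_path = os.path.join(out_dir, 'template.xlsx')
    seq_path = os.path.join(out_dir, 'template_seq.fna')

    create_template(core_path, seq_path, taxon='eukaryote',
                    extra_entries=10, data_repo_metadata=False)
    return core_path, seq_path
```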