Records¶
In addition to the Python modules documented below, note that the directory hepdata/modules/records/static/js
contains the JavaScript code that renders the tables and plots in a web browser using the
D3.js library.
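hepdata.modules.records.api | API for HEPData-Records.
hepdata.modules.records.ext | Jinja utilities for Invenio.
hepdata.modules.records.views | Blueprint for HEPData-Records.
hepdata.modules.records.subscribers.api | HEPData Subscribers API.
hepdata.modules.records.subscribers.models | HEPData Subscribers Model.
hepdata.modules.records.utils.records_update_utils | Update INSPIRE publication information.
hepdata.modules.records.utils.yaml_utils | YAML Processing Utils.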
hepdata.modules.records.api¶
API for HEPData-Records.
- hepdata.modules.records.api.format_submission(recid, record, version, version_count, hepdata_submission, data_table=None)[source]¶
Performs all the processing of the record to be displayed.
- Parameters:
recid
record
version
version_count
hepdata_submission
data_table
- Returns:
- hepdata.modules.records.api.format_tables(ctx, data_record_query, data_table, recid)[source]¶
Finds all the tables related to a submission and formats them for display in the UI or as JSON.
- Returns:
- hepdata.modules.records.api.format_resource(resource, contents, content_url)[source]¶
Gets info about a resource, ready to be displayed on the resource’s landing page.
- Parameters:
resource – DataResource object to be displayed
contents – Resource file contents
- Returns:
context dictionary ready for the template
- hepdata.modules.records.api.should_send_json_ld(request)[source]¶
Determine whether to send JSON-LD instead of HTML for this request
- hepdata.modules.records.api.get_commit_message(ctx, recid)[source]¶
Returns a commit message for the current version if present.
- Parameters:
ctx
recid
- hepdata.modules.records.api.create_breadcrumb_text(authors, ctx, record)[source]¶
Creates the breadcrumb text for a submission.
- hepdata.modules.records.api.submission_has_resources(hepsubmission)[source]¶
Returns whether the submission has resources attached.
- Parameters:
hepsubmission – HEPSubmission object
- Returns:
bool
- hepdata.modules.records.api.render_record(recid, record, version, output_format, light_mode=False)[source]¶
- hepdata.modules.records.api.create_new_version(recid, user, notify_uploader=True, uploader_message=None)[source]¶
- hepdata.modules.records.api.process_payload(recid, file, redirect_url, synchronous=False)[source]¶
Process an uploaded file
- Parameters:
recid – int The id of the record to update
file – file The file to process
redirect_url – string Redirect URL to record, for use if the upload fails or in synchronous mode
synchronous – bool If False (default), process asynchronously via celery; if True, process immediately (only recommended for tests)
- Returns:
JSONResponse either containing ‘url’ (for success cases) or ‘message’ (for error cases, which will give a 400 error).
- (task)hepdata.modules.records.api.process_saved_file(file_path, recid, userid, redirect_url, previous_status)[source]¶
- hepdata.modules.records.api.check_and_convert_from_oldhepdata(input_directory, id, timestamp)[source]¶
Check if the input directory contains a .oldhepdata file and, if so, convert it to YAML.
- hepdata.modules.records.api.assign_or_create_review_status(data_table_metadata, publication_recid, version)[source]¶
If a review already exists, it will be attached to the current data record. If a review does not exist for a data table, it will be created.
- Parameters:
data_table_metadata – the metadata describing the main table.
publication_recid – publication record id
version
- hepdata.modules.records.api.process_data_tables(ctx, data_record_query, first_data_id, data_table=None)[source]¶
- hepdata.modules.records.api.get_all_ids(index=None, id_field='recid', last_updated=None, latest_first=False)[source]¶
Get all record or inspire ids of publications in the search index
- Parameters:
index – name of index to use.
id_field – id type to return. Should be ‘recid’ or ‘inspire_id’
- Returns:
list of integer ids
Queries the database for all HEPSubmission objects contained in this object’s related record ID list. (All submissions this one is relating to)
- Returns:
[list] A list of HEPSubmission objects
Queries the database for all records in the RelatedRecId table that have THIS record’s id as a related record. Then returns the HEPSubmission object marked in the RelatedRecid table. Returns only submissions marked as ‘finished’
- Returns:
[list] List containing related records.
Queries the database for all DataSubmission objects contained in this object’s related DOI list. Only returns an object if associated HEPSubmission status is ‘finished’ (All submissions this one is relating to)
- Parameters:
data_submission – The datasubmission object to find related data for.
- Returns:
[list] A list of DataSubmission objects
Get the DataSubmission Objects with a RelatedTable entry where this doi is referred to in related_doi.
- Parameters:
data_submission – The datasubmission to find the related entries for.
- Returns:
[List] List of DataSubmission objects.
- hepdata.modules.records.api.get_record_data_list(record, data_type)[source]¶
Generates a dictionary (title/recid) from a list of record IDs. This must be done as the record contents are not stored within the hepsubmission object.
- Parameters:
record – The record used for the query.
data_type – Either the related, or related to this data.
- Returns:
[list] A list of dictionary objects containing record ID and title pairs
- hepdata.modules.records.api.get_table_data_list(table, data_type)[source]¶
Generates a list of general information (name, doi, desc) dictionaries of related DataSubmission objects. Will either use the related data list (get_related_data_submissions) OR the related to this list (generated by get_related_to_this_datasubmissions)
- Parameters:
table – The DataSubmission object used for querying.
data_type – The flag to decide which relation data to use.
- Returns:
[list] A list of dictionaries with the name, doi and description of the object.
hepdata.modules.records.ext¶
Jinja utilities for Invenio.
hepdata.modules.records.views¶
Blueprint for HEPData-Records.
- hepdata.modules.records.views.metadata(recid)[source]¶
Queries and returns a data record.
- Parameters:
recid – the record id being queried
- Returns:
renders the record template
- hepdata.modules.records.views.get_latest()[source]¶
Returns the N latest records from the database.
- Parameters:
n
- Returns:
- hepdata.modules.records.views.get_table_data(data_recid, version)[source]¶
Gets the table data only for a specific recid/version.
- Parameters:
data_recid – The data recid used for retrieval
version – The data version to retrieve
- Returns:
- hepdata.modules.records.views.get_table_details(recid, data_recid, version, load_all=1)[source]¶
Get the table details of a given datasubmission.
- Parameters:
recid
data_recid
version
load_all – Whether to perform the filesize check or not when loading (1 will always load the file)
- Returns:
- hepdata.modules.records.views.get_coordinator_view(recid)[source]¶
Returns the coordinator view for a record.
- Parameters:
recid
- hepdata.modules.records.views.get_data_reviews_for_record()[source]¶
Get the data reviews for a record.
- Returns:
json response with reviews (or a json with an error key if not)
- hepdata.modules.records.views.add_data_review_messsage(publication_recid, data_recid)[source]¶
Adds a new review message for a data submission.
- Parameters:
publication_recid
data_recid
- hepdata.modules.records.views.get_all_review_messages(publication_recid)[source]¶
Gets the review messages for a publication id.
- Parameters:
publication_recid
- Returns:
- hepdata.modules.records.views.get_resources(recid, version)[source]¶
Gets a list of resources for a publication, relevant to all data records.
- Parameters:
recid
- Returns:
json
- hepdata.modules.records.views.process_resource(reference)[source]¶
For a submission resource, create the link to the location, or the image file if an image.
- Parameters:
reference
- Returns:
dict
- hepdata.modules.records.views.get_resource(resource_id)[source]¶
Attempts to find any HTML resources to be displayed for a record in the event that it does not have proper data records included.
- Parameters:
recid – publication record id
- Returns:
json dictionary containing any HTML files to show.
- hepdata.modules.records.views.cli_upload()[source]¶
Used by the hepdata-cli tool to upload a submission.
- Returns:
- hepdata.modules.records.views.revise_submission(recid)[source]¶
This method creates a new version of a submission.
- Parameters:
recid – record id to attach the data to
- Returns:
For POST requests, returns JSONResponse either containing ‘url’ (for success cases) or ‘message’ (for error cases, which will give a 400 error). For GET requests, redirects to the record.
- hepdata.modules.records.views.consume_data_payload(recid)[source]¶
This method persists, then presents the loaded data back to the user.
- Parameters:
recid – record id to attach the data to
- Returns:
For POST requests, returns JSONResponse either containing ‘url’ (for success cases) or ‘message’ (for error cases, which will give a 400 error). For GET requests, redirects to the record.
- hepdata.modules.records.views.attach_information_to_record(recid)[source]¶
Given an INSPIRE data representation, this will process the data, and update information for a given record id with the contents.
- Returns:
- hepdata.modules.records.views.consume_sandbox_payload()[source]¶
Creates a new sandbox submission with a new file upload.
- Parameters:
recid
hepdata.modules.records.importer.api¶
- hepdata.modules.records.importer.api.import_records(inspire_ids, synchronous=False, update_existing=False, base_url='https://hepdata.net', send_email=False)[source]¶
Import records from hepdata.net
- Parameters:
inspire_ids – array of inspire ids to load (in the format insXXX).
synchronous – if should be run immediately rather than via celery
update_existing – whether to update records that already exist
base_url – override default base URL
send_email – whether to send emails on finalising submissions
- Returns:
None
- hepdata.modules.records.importer.api.get_inspire_ids(base_url='https://hepdata.net', last_updated=None, n_latest=None)[source]¶
Get inspire IDs from hepdata.net
- Parameters:
last_updated – get IDs of records updated on/after this date
n_latest – get the n most recently updated IDs
base_url – override default base URL
- Returns:
list of integer IDs, or False in the case of errors
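get_inspire_ids is documented as returning integer IDs (or False on error), whereas import_records expects strings in the format insXXX. A minimal sketch of the glue between the two, assuming a hypothetical helper named to_import_ids that is not part of HEPData:

```python
def to_import_ids(inspire_ids):
    """Convert integer INSPIRE IDs to the 'insXXX' strings used by import_records.

    Hypothetical helper for illustration only; not part of hepdata.
    """
    # get_inspire_ids is documented to return False in the case of errors.
    if inspire_ids is False:
        return []
    return [f"ins{int(inspire_id)}" for inspire_id in inspire_ids]


# Illustrative IDs only; in practice these would come from get_inspire_ids().
example_ids = to_import_ids([1283842, 1245023])
```

The resulting list could then be passed as the inspire_ids argument of import_records.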
hepdata.modules.records.subscribers.api¶
HEPData Subscribers API.
hepdata.modules.records.subscribers.models¶
HEPData Subscribers Model.
hepdata.modules.records.subscribers.rest¶
hepdata.modules.records.utils.analyses¶
hepdata.modules.records.utils.common¶
- hepdata.modules.records.utils.common.find_file_in_directory(directory, file_predicate)[source]¶
Finds a file in a directory. Useful, for example, when the submission.yaml file is not at the top level of the unzipped archive but one or more levels below.
- Parameters:
directory
file_predicate – a lambda that checks if it’s the file you’re looking for
- Returns:
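The search described above can be sketched with os.walk. This is an illustrative stand-in rather than the HEPData implementation: the name find_file_sketch is hypothetical, and returning the full path (or None when nothing matches) is an assumption, since the return value is not documented here.

```python
import os
import tempfile


def find_file_sketch(directory, file_predicate):
    """Return the path of the first file whose name satisfies file_predicate.

    Hypothetical stand-in; the return shape is an assumption.
    """
    for root, _dirs, files in os.walk(directory):
        for name in sorted(files):
            if file_predicate(name):
                return os.path.join(root, name)
    return None


# Build a throwaway archive layout with submission.yaml two levels down.
with tempfile.TemporaryDirectory() as tmp:
    nested = os.path.join(tmp, "archive", "inner")
    os.makedirs(nested)
    open(os.path.join(nested, "submission.yaml"), "w").close()

    found = find_file_sketch(tmp, lambda name: name == "submission.yaml")
    relative = os.path.relpath(found, tmp)
    not_found = find_file_sketch(tmp, lambda name: name == "missing.yaml")
```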
- hepdata.modules.records.utils.common.truncate_string(string, max_words=None, max_chars=None)[source]¶
- hepdata.modules.records.utils.common.get_record_contents(recid, status=None)[source]¶
Tries to get record from OpenSearch first. Failing that, it tries from the database.
- Parameters:
recid – Record ID to get.
status – Status of submission. If provided and not ‘finished’, will not check OpenSearch first.
- Returns:
a dictionary containing the record contents if the recid exists, None otherwise.
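The lookup order described above can be sketched as a simple fallback. The two lookup callables are assumptions that stand in for the OpenSearch and database queries, and get_record_contents_sketch is a hypothetical name, not the HEPData function.

```python
def get_record_contents_sketch(recid, search_lookup, db_lookup, status=None):
    """Try the search index first, then fall back to the database.

    search_lookup and db_lookup are hypothetical callables that return
    a dict for a known recid and None otherwise.
    """
    record = None
    # Only consult the search index when the status is unknown or 'finished'.
    if status is None or status == "finished":
        record = search_lookup(recid)
    if record is None:
        record = db_lookup(recid)
    return record


# In-memory stand-ins for the two backends (illustrative data only).
search_index = {1: {"recid": 1, "source": "search"}}
database = {1: {"recid": 1, "source": "db"}, 2: {"recid": 2, "source": "db"}}

from_search = get_record_contents_sketch(1, search_index.get, database.get)
from_db = get_record_contents_sketch(1, search_index.get, database.get, status="todo")
fallback = get_record_contents_sketch(2, search_index.get, database.get)
missing = get_record_contents_sketch(3, search_index.get, database.get)
```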
- hepdata.modules.records.utils.common.load_table_data(recid, version)[source]¶
Loads the YAML data of a specific data file.
- Parameters:
recid – The recid used for the query
version – The data version to select
- Return table_contents:
A dict containing the table data
- hepdata.modules.records.utils.common.file_size_check(file_location, load_all)[source]¶
Decides if a file breaks the maximum size threshold for immediate loading on the records page.
- Parameters:
file_location – Location of the data file on disk
load_all – If the check should be run
- Return bool:
Pass or fail
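A minimal sketch of such a check, assuming a hypothetical threshold of 1024 bytes (the real limit is not stated here) and assuming that load_all=1 bypasses the check, as described for get_table_details:

```python
import os
import tempfile

# Hypothetical threshold in bytes; the real limit is not documented here.
EXAMPLE_SIZE_THRESHOLD = 1024


def file_size_check_sketch(file_location, load_all, threshold=EXAMPLE_SIZE_THRESHOLD):
    """Return True if the file may be loaded immediately, False otherwise."""
    # load_all == 1 means the file is always loaded, whatever its size.
    if load_all == 1:
        return True
    return os.path.getsize(file_location) <= threshold


# Write a 2048-byte file, which exceeds the example threshold.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"x" * 2048)
    big_file = handle.name

checked = file_size_check_sketch(big_file, load_all=0)
forced = file_size_check_sketch(big_file, load_all=1)
os.remove(big_file)
```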
hepdata.modules.records.utils.data_processing_utils¶
- hepdata.modules.records.utils.data_processing_utils.pad_independent_variables(table_contents)[source]¶
Pads out the independent variable column in the event that nothing exists.
- Parameters:
table_contents
- Returns:
- hepdata.modules.records.utils.data_processing_utils.fix_nan_inf(value)[source]¶
Converts NaN, +inf, and -inf values to strings.
- Parameters:
value
- Returns:
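One way such a conversion could look, using only the standard library. The function name and the exact strings produced are assumptions; the real function may use different representations.

```python
import math


def fix_nan_inf_sketch(value):
    """Replace non-finite floats with their string form; pass other values through.

    Hypothetical stand-in, not the HEPData implementation.
    """
    if isinstance(value, float) and (math.isnan(value) or math.isinf(value)):
        return str(value)
    return value


converted = [fix_nan_inf_sketch(v) for v in (float("nan"), float("inf"), float("-inf"), 1.5, "text")]
```

Non-finite floats are not valid JSON numbers, which is one reason a conversion like this is useful before serialising table data.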
- hepdata.modules.records.utils.data_processing_utils.process_independent_variables(table_contents, x_axes, independent_variable_headers)[source]¶
- hepdata.modules.records.utils.data_processing_utils.process_dependent_variables(group_count, record, table_contents, tmp_values, independent_variables, dependent_variable_headers)[source]¶
- hepdata.modules.records.utils.data_processing_utils.generate_table_data(table_contents)[source]¶
Creates a renderable data table structure.
- Parameters:
table_contents
- Returns:
A dictionary containing the table headers/values
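As a rough illustration of what a renderable structure might look like, the sketch below flattens a table in the HEPData YAML layout (independent_variables and dependent_variables, each with a header and a list of values) into headers and rows. The output keys, the "low - high" bin formatting and the function name are all assumptions; the real function returns a richer structure.

```python
def table_to_rows_sketch(table_contents):
    """Flatten a HEPData-style table dict into {'headers': [...], 'values': [...]}.

    Hypothetical simplification for illustration only.
    """
    columns = (table_contents.get("independent_variables", [])
               + table_contents.get("dependent_variables", []))
    headers = [column["header"]["name"] for column in columns]
    n_rows = max((len(column["values"]) for column in columns), default=0)

    rows = []
    for i in range(n_rows):
        row = []
        for column in columns:
            values = column["values"]
            cell = values[i] if i < len(values) else {}
            if "value" in cell:
                row.append(cell["value"])
            elif "low" in cell and "high" in cell:
                # Binned values are shown as a range.
                row.append(f"{cell['low']} - {cell['high']}")
            else:
                row.append("-")
        rows.append(row)
    return {"headers": headers, "values": rows}


# Illustrative table contents, not taken from a real record.
example_table = {
    "independent_variables": [
        {"header": {"name": "PT"}, "values": [{"low": 0, "high": 10}, {"low": 10, "high": 20}]},
    ],
    "dependent_variables": [
        {"header": {"name": "SIG"}, "values": [{"value": 1.5}, {"value": 2.5}]},
    ],
}
rendered = table_to_rows_sketch(example_table)
```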
hepdata.modules.records.utils.doi_minter¶
- (task)hepdata.modules.records.utils.doi_minter.generate_doi_for_table(doi)[source]¶
Generate DOI for a specific table given by its doi.
- Parameters:
doi
- Returns:
- (task)hepdata.modules.records.utils.doi_minter.generate_dois_for_submission(*args, **kwargs)[source]¶
Generate DOIs for all the submission components.
- Parameters:
args
kwargs
- Returns:
- (task)hepdata.modules.records.utils.doi_minter.create_container_doi(hep_submission_id, data_submission_ids, resource_ids, site_url)[source]¶
Creates the payload to wrap the whole submission.
- Parameters:
hep_submission
data_submissions
resource_ids
publication_info
- Returns:
- (task)hepdata.modules.records.utils.doi_minter.create_data_doi(hep_submission_id, data_submission_id, site_url)[source]¶
Generate DOI record for a data record.
- Parameters:
data_submission_id
version
- Returns:
- (task)hepdata.modules.records.utils.doi_minter.create_resource_doi(hep_submission_id, resource_id, site_url)[source]¶
Generate DOI record for a data resource
- Parameters:
resource_id
version
- Returns:
- hepdata.modules.records.utils.doi_minter.reserve_doi_for_hepsubmission(hepsubmission, update=False)[source]¶
- hepdata.modules.records.utils.doi_minter.reserve_dois_for_data_submissions(*args, **kwargs)[source]¶
Reserves a DOI for a data submission and saves to the datasubmission object.
- Parameters:
data_submission – DataSubmission object representing a data table.
- Returns:
- hepdata.modules.records.utils.doi_minter.reserve_dois_for_resources(publication_recid, version, resources=None)[source]¶
Reserves a DOI for a data submission and saves to the datasubmission object.
- Parameters:
resources – list of DataResource objects
- Returns:
hepdata.modules.records.utils.old_hepdata¶
hepdata.modules.records.utils.records_update_utils¶
Update INSPIRE publication information.
- (task)hepdata.modules.records.utils.records_update_utils.update_record_info(inspire_id, send_email=False)[source]¶
Update publication information from INSPIRE for a specific record.
- (task)hepdata.modules.records.utils.records_update_utils.update_records_info_since(date)[source]¶
Update publication information from INSPIRE for all records updated since a certain date.
- (task)hepdata.modules.records.utils.records_update_utils.update_records_info_on(date)[source]¶
Update publication information from INSPIRE for all records updated on a certain date.
- (task)hepdata.modules.records.utils.records_update_utils.update_all_records_info()[source]¶
Update publication information from INSPIRE for all records.
hepdata.modules.records.utils.submission¶
- hepdata.modules.records.utils.submission.remove_submission(record_id, version=1)[source]¶
Removes the database entries and data files related to a record.
- Parameters:
record_id
version
- Returns:
True if Successful, False if the record does not exist.
- hepdata.modules.records.utils.submission.cleanup_submission(recid, version, to_keep)[source]¶
Removes old datasubmission records from the database. This ensures that when users replace a submission, previous records are not left behind in the database.
- Parameters:
recid – publication recid of parent
version – version number of record
to_keep – an array of names to keep in the submission
- Returns:
- hepdata.modules.records.utils.submission.cleanup_data_resources(data_submission)[source]¶
Removes additional resources for a datasubmission from the database to avoid duplications. This ensures that when users replace a submission, old resources are not left behind in the database.
- Parameters:
data_submission – DataSubmission object to be cleaned
- Returns:
- hepdata.modules.records.utils.submission.cleanup_data_keywords(data_submission)[source]¶
Removes keywords from the database to avoid duplications. This ensures that when users replace a submission, old keywords are not left behind in the database.
- Parameters:
data_submission – DataSubmission object to be cleaned
- Returns:
Deletes all related record ID entries of a HEPSubmission object of a given recid
- Parameters:
recid – The record ID of the HEPSubmission object to be cleaned
- Returns:
- hepdata.modules.records.utils.submission.process_data_file(recid, version, basepath, data_obj, datasubmission, main_file_path, tablenum, overall_status)[source]¶
Takes a data file and any supplementary files and persists their metadata to the database whilst recording their upload path.
- Parameters:
recid – the record id
version – version of the resource to be stored
basepath – the path the submission has been loaded to
data_obj – Object representation of loaded YAML file
datasubmission – the DataSubmission object representing this file in the DB
main_file_path – the data file path
tablenum – This table’s number in the submission.
overall_status – Overall status of submission to use for sandbox filtering.
- Returns:
- hepdata.modules.records.utils.submission.process_general_submission_info(basepath, submission_info_document, recid)[source]¶
Processes the top level information about a submission, extracting the information about the data abstract, additional resources for the submission (files, links, and html inserts) and historical modification information.
- Parameters:
basepath – the path the submission has been loaded to
submission_info_document – the data document
recid
- Returns:
- hepdata.modules.records.utils.submission.parse_additional_resources(basepath, recid, yaml_document)[source]¶
Parses out the additional resource section for a full submission.
- Parameters:
basepath – the path the submission has been loaded to
recid
yaml_document
- Returns:
- hepdata.modules.records.utils.submission.parse_modifications(hepsubmission, recid, submission_info_document)[source]¶
- hepdata.modules.records.utils.submission.process_submission_directory(basepath, submission_file_path, recid, update=False, old_schema=False)[source]¶
Goes through an entire submission directory and processes the files within to create DataSubmissions with the files and related material attached as DataResources.
- Parameters:
basepath
submission_file_path
recid
update
old_schema – whether to use old (v0) submission and data schemas (should only be used when importing old records)
- Returns:
- hepdata.modules.records.utils.submission.package_submission(basepath, recid, hep_submission_obj)[source]¶
Zips up a submission directory in advance of its download, for example by users.
- Parameters:
basepath – path of directory containing all submission files
recid – the publication record ID
hep_submission_obj – the HEPSubmission object representing the overall position
- hepdata.modules.records.utils.submission.clean_error_message_for_display(error_message, dir)[source]¶
- hepdata.modules.records.utils.submission.get_or_create_hepsubmission(recid, coordinator=1, status='todo')[source]¶
Gets or creates a new HEPSubmission record.
- Parameters:
recid – the publication record id
coordinator – the user id of the user who owns this record
status – e.g. todo, finished.
- Returns:
the existing or newly created HEPSubmission object
- hepdata.modules.records.utils.submission.create_data_review(data_recid, publication_recid, version=1)[source]¶
Creates a new data review given a data record id and a publication record id.
- Parameters:
data_recid
publication_recid
version
- Returns:
- hepdata.modules.records.utils.submission.do_finalise(recid, publication_record=None, force_finalise=False, commit_message=None, send_tweet=False, update=False, convert=True, send_email=True)[source]¶
Creates record SIP for each data record with a link to the associated publication.
- Parameters:
recid (int) – publication_recid of HEPSubmission to finalise
publication_record (HEPSubmission) – HEPSubmission object to finalise
force_finalise (bool) – Whether to force finalisation. If False, will only finalise if logged-in user is the submission coordinator. Should only be set to True for admin tasks/testing.
commit_message (str) – Version information for updated versions of a submission.
send_tweet (bool) – Whether to tweet about the new submission.
update (bool) – Whether to update the existing data records rather than create new ones (only used for admin/test purposes)
convert (bool) – Whether to convert to (and store) other data formats using hepdata_converter
send_email (bool) – Whether to email the submission participants and coordinator to inform them that the submission is complete
- Returns:
JSON string with keys: success, recid, (on success) data_count, generated_records, (on failure) errors.
- Return type:
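Because the result is a JSON string rather than a dictionary, callers need to decode it before inspecting the keys. A minimal sketch, in which the sample payloads and the helper name are illustrative assumptions and not real output:

```python
import json


def summarise_finalise_result(result_json):
    """Turn a do_finalise-style JSON string into a one-line summary.

    Hypothetical helper; key names follow the description of the return value.
    """
    result = json.loads(result_json)
    if result.get("success"):
        return f"record {result['recid']}: {result['data_count']} data records finalised"
    return f"record {result['recid']} failed: {result['errors']}"


# Made-up payloads shaped like the documented return value.
ok_payload = json.dumps({"success": True, "recid": 1, "data_count": 3, "generated_records": [2, 3, 4]})
bad_payload = json.dumps({"success": False, "recid": 1, "errors": ["No submission found"]})

ok_summary = summarise_finalise_result(ok_payload)
bad_summary = summarise_finalise_result(bad_payload)
```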
hepdata.modules.records.utils.users¶
hepdata.modules.records.utils.workflow¶
- hepdata.modules.records.utils.workflow.create_data_structure(ctx)[source]¶
The data structures need to be normalised before being stored in the database. This is performed here.
- Parameters:
ctx – record information as a dictionary
- Returns:
a cleaned up representation.
- hepdata.modules.records.utils.workflow.update_record(recid, ctx)[source]¶
Updates a record given a new dictionary.
- Parameters:
recid
ctx
- Returns:
hepdata.modules.records.utils.yaml_utils¶
YAML Processing Utils.
- hepdata.modules.records.utils.yaml_utils.write_submission_yaml_block(document, submission_yaml, type='info')[source]¶
- hepdata.modules.records.utils.yaml_utils.split_files(file_location, output_location)[source]¶
- Parameters:
file_location – input yaml file location
output_location – output directory path
- hepdata.modules.records.utils.yaml_utils.cleanup_data_yaml(yaml)[source]¶
Casts strings to numbers where possible.
- Parameters:
yaml
- Returns:
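Casting strings to numbers where possible could be sketched as a recursive walk over the loaded YAML. The function name, and the choice to try int before float, are assumptions rather than the HEPData implementation.

```python
def cast_strings_to_numbers(node):
    """Recursively replace numeric-looking strings with int or float values.

    Hypothetical stand-in for illustration; other values are left untouched.
    """
    if isinstance(node, dict):
        return {key: cast_strings_to_numbers(value) for key, value in node.items()}
    if isinstance(node, list):
        return [cast_strings_to_numbers(item) for item in node]
    if isinstance(node, str):
        # Try the narrower type first so that "1" becomes 1 rather than 1.0.
        for caster in (int, float):
            try:
                return caster(node)
            except ValueError:
                continue
    return node


# Illustrative document fragment, not taken from a real submission.
cleaned = cast_strings_to_numbers(
    {"values": [{"value": "1"}, {"value": "2.5"}, {"value": "n/a"}, {"value": None}]}
)
```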