How to create a PKB (Publisher Knowledge Base)


The platform_AllTitles_YYYY-MM-DD.txt file contains the mappings between proprietary identifiers (used by the content provider on its platform) and ISSNs or other standard identifiers. The KBART title_id field holds the proprietary identifier, which is matched to the print_identifier or the online_identifier. The full list of KBART fields and their meanings is available online.
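As a rough illustration of this mapping, the Python sketch below reads a tab-separated KBART stream and indexes its rows by title_id. The helper name load_pkb and the reduced sample are assumptions made for the example; this is not how ezPAARSE loads its knowledge bases internally.

```python
import csv
import io


def load_pkb(stream):
    """Index the rows of a tab-separated KBART stream by title_id.

    Hypothetical helper written for illustration only.
    """
    reader = csv.DictReader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    index = {}
    for row in reader:
        title_id = (row.get("title_id") or "").strip()
        if not title_id:
            continue  # a row without a proprietary identifier cannot be matched
        index[title_id] = {
            "print_identifier": (row.get("print_identifier") or "").strip(),
            "online_identifier": (row.get("online_identifier") or "").strip(),
        }
    return index


# Reduced sample with only the columns used here; a real file has all KBART fields.
sample = (
    "publication_title\tprint_identifier\tonline_identifier\ttitle_id\n"
    "Analytical Chemistry\t0003-2700\t1520-6882\tac\n"
)
pkb = load_pkb(io.StringIO(sample))
print(pkb["ac"])
```

Keying the dictionary on title_id mirrors the direction of the lookup: the proprietary identifier is what is known, and the standard identifiers are what is looked up.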

There can be several PKB files* for the same platform. Each of them can be:

  • retrieved from the content provider's website as one or more KBART files
  • automatically generated with a scraper
  • manually edited
publication_title	print_identifier	online_identifier	date_first_issue_online	num_first_vol_online	num_first_issue_online	date_last_issue_online	num_last_vol_online	num_last_issue_online	title_url	first_author	title_id	embargo_info	coverage_depth	coverage_notes	publisher_name	publication_type	date_monograph_published_print	date_monograph_published_online	monograph_volume	monograph_edition	first_editor	parent_publication_title_id	preceding_publication_title_id	access_type
ACS Applied Materials & Interfaces	1944-8244	1944-8252		aamick													
Analytical Chemistry	0003-2700	1520-6882		ac													

(*): in that case, they are all taken into account by ezPAARSE and each title_id identifier must be unique across them. The pkbvalidator utility checks this uniqueness across all the files.
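The uniqueness constraint on title_id can be sketched in a few lines of Python. This is only an illustration of the idea, not the pkbvalidator implementation; the function name find_duplicate_title_ids and the file names are hypothetical, and the sample rows are reduced to two columns.

```python
import csv
import io
from collections import defaultdict


def find_duplicate_title_ids(named_streams):
    """Return {title_id: [file names]} for every title_id seen more than once.

    Illustrative helper only; named_streams is an iterable of (name, stream) pairs.
    """
    seen = defaultdict(list)
    for name, stream in named_streams:
        reader = csv.DictReader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            title_id = (row.get("title_id") or "").strip()
            if title_id:
                seen[title_id].append(name)
    return {tid: names for tid, names in seen.items() if len(names) > 1}


# Hypothetical file names and reduced columns, for demonstration only.
header = "publication_title\ttitle_id\n"
file_a = io.StringIO(header + "Analytical Chemistry\tac\n")
file_b = io.StringIO(
    header
    + "Analytical Chemistry\tac\n"
    + "ACS Applied Materials & Interfaces\taamick\n"
)
duplicates = find_duplicate_title_ids(
    [("platform_AllTitles_2014-01-01.txt", file_a),
     ("platform_AllTitles_2014-02-01.txt", file_b)]
)
print(duplicates)
```

Reporting the file names alongside each duplicated title_id makes it easier to decide which file should keep the entry.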

Post-enrichment of the PKB file

A PKB file name follows the pattern platform_AllTitles_YYYY-MM-DD.txt.
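A file name can be checked against this pattern with a short Python sketch such as the one below. The helper parse_pkb_name and the sample name are hypothetical, and the regular expression assumes that the platform part contains no underscore, which the page does not state.

```python
import re
from datetime import datetime

# Assumption: the platform part contains no underscore.
PKB_NAME = re.compile(
    r"^(?P<platform>[^_]+)_AllTitles_(?P<date>\d{4}-\d{2}-\d{2})\.txt$"
)


def parse_pkb_name(filename):
    """Return (platform, date) if the name follows the pattern, else None."""
    match = PKB_NAME.match(filename)
    if match is None:
        return None
    try:
        date = datetime.strptime(match.group("date"), "%Y-%m-%d").date()
    except ValueError:
        return None  # well-formed digits but not a real calendar date
    return match.group("platform"), date


print(parse_pkb_name("acs_AllTitles_2014-05-09.txt"))
```

Parsing the date with strptime rather than relying on the regular expression alone rejects names such as a month 13 that only look like dates.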

A PKB can be built step by step, with ezPAARSE helping in this incremental approach.

For example, when the log analysis conducted by ezPAARSE reports that some lines are not recognized because the PKB is incomplete, retrieve the lines-pkb-miss-ecs.log file (by clicking the “PKBs missing” link on the results page of a processing job) and use it to complete one or more PKBs.

Knowledge Base Validation

ezPAARSE loads the knowledge bases, and their structure has to be checked beforehand with the pkbvalidator command.

platforms/contribute/pkb_en.txt · Last modified: 2014/05/09 08:34 by porquet