354 | 354 | <span class="gp">>>> </span><span class="n">record</span> <span class="o">=</span> <span class="n">Entrez</span><span class="o">.</span><span class="n">read</span><span class="p">(</span><span class="n">stream</span><span class="p">)</span> |
355 | 355 | </pre></div> |
356 | 356 | </div> |
357 | | -<p>Now <code class="docutils literal notranslate"><span class="pre">record</span></code> is a dictionary with exactly one key:</p> |
| 357 | +<p>Now <code class="docutils literal notranslate"><span class="pre">record</span></code> is a dictionary with three keys:</p> |
358 | 358 | <div class="highlight-pycon notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">record</span><span class="o">.</span><span class="n">keys</span><span class="p">()</span> |
359 | | -<span class="go">dict_keys(['DbList'])</span> |
| 359 | +<span class="go">dict_keys(['DbInfo', 'ERROR', 'DbList'])</span> |
360 | 360 | </pre></div> |
361 | 361 | </div> |
362 | | -<p>The values stored in this key is the list of database names shown in the |
363 | | -XML above:</p> |
| 362 | +<p>The value stored under the <code class="docutils literal notranslate"><span class="pre">'DbList'</span></code> key is the list of database |
| 363 | +names shown in the XML above:</p> |
364 | 364 | <div class="highlight-pycon notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">record</span><span class="p">[</span><span class="s2">"DbList"</span><span class="p">]</span> |
365 | 365 | <span class="go">['pubmed', 'protein', 'nucleotide', 'nuccore', 'nucgss', 'nucest',</span> |
366 | 366 | <span class="go"> 'structure', 'genome', 'books', 'cancerchromosomes', 'cdd', 'gap',</span> |
|
376 | 376 | <span class="gp">>>> </span><span class="n">Entrez</span><span class="o">.</span><span class="n">email</span> <span class="o">=</span> <span class="s2">"A.N.Other@example.com"</span> <span class="c1"># Always tell NCBI who you are</span> |
377 | 377 | <span class="gp">>>> </span><span class="n">stream</span> <span class="o">=</span> <span class="n">Entrez</span><span class="o">.</span><span class="n">einfo</span><span class="p">(</span><span class="n">db</span><span class="o">=</span><span class="s2">"pubmed"</span><span class="p">)</span> |
378 | 378 | <span class="gp">>>> </span><span class="n">record</span> <span class="o">=</span> <span class="n">Entrez</span><span class="o">.</span><span class="n">read</span><span class="p">(</span><span class="n">stream</span><span class="p">)</span> |
379 | | -<span class="gp">>>> </span><span class="n">record</span><span class="p">[</span><span class="s2">"DbInfo"</span><span class="p">][</span><span class="s2">"Description"</span><span class="p">]</span> |
| 379 | +<span class="gp">>>> </span><span class="n">record</span><span class="p">[</span><span class="s2">"DbInfo"</span><span class="p">][</span><span class="mi">0</span><span class="p">][</span><span class="s2">"Description"</span><span class="p">]</span> |
380 | 380 | <span class="go">'PubMed bibliographic record'</span> |
381 | 381 | </pre></div> |
382 | 382 | </div> |
383 | | -<div class="highlight-pycon notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">record</span><span class="p">[</span><span class="s2">"DbInfo"</span><span class="p">][</span><span class="s2">"Count"</span><span class="p">]</span> |
| 383 | +<div class="highlight-pycon notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="n">record</span><span class="p">[</span><span class="s2">"DbInfo"</span><span class="p">][</span><span class="mi">0</span><span class="p">][</span><span class="s2">"Count"</span><span class="p">]</span> |
384 | 384 | <span class="go">'17989604'</span> |
385 | | -<span class="gp">>>> </span><span class="n">record</span><span class="p">[</span><span class="s2">"DbInfo"</span><span class="p">][</span><span class="s2">"LastUpdate"</span><span class="p">]</span> |
| 385 | +<span class="gp">>>> </span><span class="n">record</span><span class="p">[</span><span class="s2">"DbInfo"</span><span class="p">][</span><span class="mi">0</span><span class="p">][</span><span class="s2">"LastUpdate"</span><span class="p">]</span> |
386 | 386 | <span class="go">'2008/05/24 06:45'</span> |
387 | 387 | </pre></div> |
388 | 388 | </div> |
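<p>Note that the values above come back as strings. The following is a minimal sketch, not part of Biopython, of converting them to native Python types. It assumes the <code>record["DbInfo"][0]</code> indexing shown in the updated lines; the helper name <code>summarize_dbinfo</code> and the hand-built <code>record</code> dictionary are hypothetical stand-ins so that the example runs without network access.</p>

```python
from datetime import datetime

# Hand-built stand-in for the parsed record (an assumption for illustration);
# in real use this would be the object returned by Entrez.read(stream).
record = {
    "DbInfo": [
        {
            "Description": "PubMed bibliographic record",
            "Count": "17989604",
            "LastUpdate": "2008/05/24 06:45",
        }
    ]
}


def summarize_dbinfo(record):
    """Return the description, an integer count, and a datetime (hypothetical helper)."""
    info = record["DbInfo"][0]  # first (and here only) database entry
    return {
        "description": info["Description"],
        "count": int(info["Count"]),  # the count is stored as a string
        "last_update": datetime.strptime(info["LastUpdate"], "%Y/%m/%d %H:%M"),
    }


summary = summarize_dbinfo(record)
print(summary["count"], summary["last_update"].isoformat())
```

<p>Converting once, right after parsing, keeps later comparisons (for example, checking whether a database has been updated since a given date) free of string handling.</p>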
@@ -993,59 +993,20 @@ <h3>The file ends prematurely or is otherwise corrupted<a class="headerlink" hre |
993 | 993 | </section> |
994 | 994 | <section id="the-file-contains-items-that-are-missing-from-the-associated-dtd"> |
995 | 995 | <h3>The file contains items that are missing from the associated DTD<a class="headerlink" href="#the-file-contains-items-that-are-missing-from-the-associated-dtd" title="Link to this heading"></a></h3> |
996 | | -<p>This is an example of an XML file containing tags that do not have a |
997 | | -description in the corresponding DTD file:</p> |
998 | | -<div class="highlight-text notranslate"><div class="highlight"><pre><span></span><?xml version="1.0"?> |
999 | | -<!DOCTYPE eInfoResult PUBLIC "-//NLM//DTD eInfoResult, 11 May 2002//EN" "https://www.ncbi.nlm.nih.gov/entrez/query/DTD/eInfo_020511.dtd"> |
1000 | | -<eInfoResult> |
1001 | | - <DbInfo> |
1002 | | - <DbName>pubmed</DbName> |
1003 | | - <MenuName>PubMed</MenuName> |
1004 | | - <Description>PubMed bibliographic record</Description> |
1005 | | - <Count>20161961</Count> |
1006 | | - <LastUpdate>2010/09/10 04:52</LastUpdate> |
1007 | | - <FieldList> |
1008 | | - <Field> |
1009 | | -... |
1010 | | - </Field> |
1011 | | - </FieldList> |
1012 | | - <DocsumList> |
1013 | | - <Docsum> |
1014 | | - <DsName>PubDate</DsName> |
1015 | | - <DsType>4</DsType> |
1016 | | - <DsTypeName>string</DsTypeName> |
1017 | | - </Docsum> |
1018 | | - <Docsum> |
1019 | | - <DsName>EPubDate</DsName> |
1020 | | -... |
1021 | | - </DbInfo> |
1022 | | -</eInfoResult> |
1023 | | -</pre></div> |
1024 | | -</div> |
1025 | | -<p>In this file, for some reason the tag <code class="docutils literal notranslate"><span class="pre"><DocsumList></span></code> (and several |
1026 | | -others) are not listed in the DTD file <code class="docutils literal notranslate"><span class="pre">eInfo_020511.dtd</span></code>, which is |
1027 | | -specified on the second line as the DTD for this XML file. By default, |
1028 | | -the parser will stop and raise a ValidationError if it cannot find some |
1029 | | -tag in the DTD:</p> |
1030 | | -<div class="highlight-pycon notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="kn">from</span><span class="w"> </span><span class="nn">Bio</span><span class="w"> </span><span class="kn">import</span> <span class="n">Entrez</span> |
1031 | | -<span class="gp">>>> </span><span class="n">stream</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="s2">"einfo3.xml"</span><span class="p">,</span> <span class="s2">"rb"</span><span class="p">)</span> |
1032 | | -<span class="gp">>>> </span><span class="n">record</span> <span class="o">=</span> <span class="n">Entrez</span><span class="o">.</span><span class="n">read</span><span class="p">(</span><span class="n">stream</span><span class="p">)</span> |
1033 | | -<span class="gt">Traceback (most recent call last):</span> |
1034 | | -<span class="w"> </span><span class="c">...</span> |
1035 | | -<span class="gr">Bio.Entrez.Parser.ValidationError</span>: <span class="n">Failed to find tag 'DocsumList' in the DTD. To skip all tags that are not represented in the DTD, please call Bio.Entrez.read or Bio.Entrez.parse with validate=False.</span> |
1036 | | -</pre></div> |
1037 | | -</div> |
1038 | | -<p>Optionally, you can instruct the parser to skip such tags instead of |
1039 | | -raising a ValidationError. This is done by calling <code class="docutils literal notranslate"><span class="pre">Entrez.read</span></code> or |
1040 | | -<code class="docutils literal notranslate"><span class="pre">Entrez.parse</span></code> with the argument <code class="docutils literal notranslate"><span class="pre">validate</span></code> equal to False:</p> |
| 996 | +<p>In some extraordinary cases, the XML file is not fully consistent with the |
| 997 | +associated DTD file, and contains some tags that are not defined in the |
| 998 | +DTD file. By default, the parser will stop and raise a <code class="docutils literal notranslate"><span class="pre">ValidationError</span></code> |
| 999 | +if it cannot find some tag in the DTD. You can instruct the parser to skip |
| 1000 | +such tags instead of raising a <code class="docutils literal notranslate"><span class="pre">ValidationError</span></code> by calling <code class="docutils literal notranslate"><span class="pre">Entrez.read</span></code> |
| 1001 | +or <code class="docutils literal notranslate"><span class="pre">Entrez.parse</span></code> with the argument <code class="docutils literal notranslate"><span class="pre">validate</span></code> equal to False:</p> |
1041 | 1002 | <div class="highlight-pycon notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="kn">from</span><span class="w"> </span><span class="nn">Bio</span><span class="w"> </span><span class="kn">import</span> <span class="n">Entrez</span> |
1042 | | -<span class="gp">>>> </span><span class="n">stream</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="s2">"einfo3.xml"</span><span class="p">,</span> <span class="s2">"rb"</span><span class="p">)</span> |
| 1003 | +<span class="gp">>>> </span><span class="n">stream</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="s2">"myxmlfile.xml"</span><span class="p">,</span> <span class="s2">"rb"</span><span class="p">)</span> |
1043 | 1004 | <span class="gp">>>> </span><span class="n">record</span> <span class="o">=</span> <span class="n">Entrez</span><span class="o">.</span><span class="n">read</span><span class="p">(</span><span class="n">stream</span><span class="p">,</span> <span class="n">validate</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span> |
1044 | 1005 | <span class="gp">>>> </span><span class="n">stream</span><span class="o">.</span><span class="n">close</span><span class="p">()</span> |
1045 | 1006 | </pre></div> |
1046 | 1007 | </div> |
1047 | 1008 | <p>Of course, the information contained in the XML tags that are not in the |
1048 | | -DTD are not present in the record returned by <code class="docutils literal notranslate"><span class="pre">Entrez.read</span></code>.</p> |
| 1009 | +DTD will not be present in the record returned by <code class="docutils literal notranslate"><span class="pre">Entrez.read</span></code>.</p> |
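<p>If you would rather keep validation on by default and relax it only when needed, one possible pattern is to try a strict parse first and retry with <code>validate=False</code> on failure. This is a sketch under stated assumptions, not a Biopython function: the helper name <code>read_lenient</code> is hypothetical, and the reader and exception class are passed in as arguments so that the block itself does not require Biopython to be installed.</p>

```python
def read_lenient(path, read, validation_error):
    """Parse strictly first; on a validation failure, retry without validation.

    Hypothetical helper. `read` is a parser function such as Entrez.read and
    `validation_error` is the exception class it raises for unknown tags.
    Returns (record, strict), where strict is False if the fallback was used.
    """
    with open(path, "rb") as stream:
        try:
            return read(stream), True
        except validation_error:
            # The first attempt consumed part of the stream, so rewind it.
            stream.seek(0)
            return read(stream, validate=False), False


# Intended use with Biopython (not executed here):
#   from Bio import Entrez
#   from Bio.Entrez.Parser import ValidationError
#   record, strict = read_lenient("myxmlfile.xml", Entrez.read, ValidationError)
#   if not strict:
#       print("Some tags were skipped because they are not in the DTD")
```

<p>Returning the <code>strict</code> flag alongside the record makes it explicit to the caller that some information from the XML file may be missing.</p>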
1049 | 1010 | </section> |
1050 | 1011 | <section id="the-file-contains-an-error-message"> |
1051 | 1012 | <h3>The file contains an error message<a class="headerlink" href="#the-file-contains-an-error-message" title="Link to this heading"></a></h3> |
|