Difference between revisions of "Exporting ratings"

From WOT Wiki
(Created page with '== Downloading ratings and comments == You can export your own ratings and comments on the "My ratings" tab in [http://www.mywot.com/user your profile]. The data is updated once …')
 
(Undo revision 19361 by Deletе (talk))
 
(16 intermediate revisions by 9 users not shown)
Line 1: Line 1:
 
== Downloading ratings and comments ==
 
== Downloading ratings and comments ==
You can export your own ratings and comments on the "My ratings" tab in [http://www.mywot.com/user your profile]. The data is updated once a day and the date of last update is shown next to the "Download" button.
+
You can export your own ratings and comments on the "My ratings" tab in [http://www.mywot.com/user your profile].  
 +
 
 +
=== Renewal ===
 +
The data is updated within 24 hours after a rating is recorded. The date of last update is shown next to the "Download" button.
 +
 
 +
=== Exported file format ===
 +
The export file is in [http://en.wikipedia.org/wiki/XML XML format].
 +
 
 +
<pre>You'll need an XML/RSS viewer / reader / editor.</pre>
 +
 
 +
If the export file is larger than 1 [http://en.wikipedia.org/wiki/Kibibyte kiB], it's compressed to a [https://en.wikipedia.org/wiki/Zip_(file_format)  ZIP file].  An archive utility may be required to extract the XML file from the ZIP file, such as:
 +
* [http://www.pkware.com/ PkZip]
 +
* [http://www.7-zip.org/ 7-Zip]
 +
* [http://www.winzip.com/ WinZip]
  
 
== Data format ==
 
== Data format ==
The data is in an XML format. If the export file is larger than 1 [http://en.wikipedia.org/wiki/Kibibyte kiB], it's compressed to a ZIP file. Here's an annotated example:
+
The data is in an XML format. Here is an annotated example:
  
 
   <span style="color: green;">''&lt;!-- character encoding is always UTF-8 --&gt;''</span>
 
   <span style="color: green;">''&lt;!-- character encoding is always UTF-8 --&gt;''</span>
Line 14: Line 27:
 
    
 
    
 
     <span style="color: green;">''&lt;!-- '''target''': one element for each rated or commented target --&gt;''</span>   
 
     <span style="color: green;">''&lt;!-- '''target''': one element for each rated or commented target --&gt;''</span>   
     <span style="color: green;">''&lt;!--  name: target name (host name, IP address), [http://en.wikipedia.org/wiki/Internationalized_domain_name IDN] are converted according to [http://www.rfc-editor.org/rfc/rfc3490.txt RFC 3490] (string) --&gt;''</span>
+
     <span style="color: green;">''&lt;!--  name: [[#Target names|target name]] (string) --&gt;''</span>
 
    
 
    
 
     <'''target''' name="<span style="color: blue;">example.com</span>">
 
     <'''target''' name="<span style="color: blue;">example.com</span>">
Line 42: Line 55:
 
       <span style="color: green;">''&lt;!--    <span style="color: blue;"> 6</span> = • Good customer experience --&gt;''</span>
 
       <span style="color: green;">''&lt;!--    <span style="color: blue;"> 6</span> = • Good customer experience --&gt;''</span>
 
       <span style="color: green;">''&lt;!--    <span style="color: blue;"> 7</span> = • Child friendly --&gt;''</span>
 
       <span style="color: green;">''&lt;!--    <span style="color: blue;"> 7</span> = • Child friendly --&gt;''</span>
      <span style="color: green;">''&lt;!--    <span style="color: blue;">20</span> = • Seal holder --&gt;''</span>
 
 
       <span style="color: green;">''&lt;!--    <span style="color: blue;">21</span> = • Good site --&gt;''</span>
 
       <span style="color: green;">''&lt;!--    <span style="color: blue;">21</span> = • Good site --&gt;''</span>
 
       <span style="color: green;">''&lt;!--    <span style="color: blue;"> 8</span> = <span style="color: red;">•</span> Spam --&gt;''</span>
 
       <span style="color: green;">''&lt;!--    <span style="color: blue;"> 8</span> = <span style="color: red;">•</span> Spam --&gt;''</span>
Line 54: Line 66:
 
       <span style="color: green;">''&lt;!--    <span style="color: blue;">16</span> = <span style="color: red;">•</span> Hateful, violent or illegal content --&gt;''</span>
 
       <span style="color: green;">''&lt;!--    <span style="color: blue;">16</span> = <span style="color: red;">•</span> Hateful, violent or illegal content --&gt;''</span>
 
       <span style="color: green;">''&lt;!--    <span style="color: blue;">22</span> = <span style="color: red;">•</span> Ethical issues --&gt;''</span>
 
       <span style="color: green;">''&lt;!--    <span style="color: blue;">22</span> = <span style="color: red;">•</span> Ethical issues --&gt;''</span>
      <span style="color: green;">''&lt;!--    <span style="color: blue;">17</span> = <span style="color: gray;">•</span> References found --&gt;''</span>
 
 
       <span style="color: green;">''&lt;!--    <span style="color: blue;">18</span> = <span style="color: gray;">•</span> Useless --&gt;''</span>
 
       <span style="color: green;">''&lt;!--    <span style="color: blue;">18</span> = <span style="color: gray;">•</span> Useless --&gt;''</span>
 
       <span style="color: green;">''&lt;!--    <span style="color: blue;">19</span> = <span style="color: gray;">•</span> Other --&gt;''</span>
 
       <span style="color: green;">''&lt;!--    <span style="color: blue;">19</span> = <span style="color: gray;">•</span> Other --&gt;''</span>
 
       <span style="color: green;">''&lt;!--  time: last changed ([http://en.wikipedia.org/wiki/ISO_8601 ISO 8601] time) --&gt;''</span>  
 
       <span style="color: green;">''&lt;!--  time: last changed ([http://en.wikipedia.org/wiki/ISO_8601 ISO 8601] time) --&gt;''</span>  
 
    
 
    
       <'''comment''' category="<span style="color: blue;">9</span>" time=time="<span style="color: blue;">2008-04-23 16:37:48+00</span>">
+
       <'''comment''' category="<span style="color: blue;">9</span>" time="<span style="color: blue;">2008-04-23 16:37:48+00</span>">
 
         <span style="color: green;">''&lt;!-- comment text is always in a CDATA element --&gt;''</span>  
 
         <span style="color: green;">''&lt;!-- comment text is always in a CDATA element --&gt;''</span>  
 
         <!CDATA[<span style="color: blue;">Comment text.</span>]]>
 
         <!CDATA[<span style="color: blue;">Comment text.</span>]]>
Line 69: Line 80:
 
     <span style="color: green;">''&lt;!-- ... more target elements for each rated or commented host ... --&gt;''</span>
 
     <span style="color: green;">''&lt;!-- ... more target elements for each rated or commented host ... --&gt;''</span>
 
   </'''wot'''>
 
   </'''wot'''>
 +
 +
=== Target names ===
 +
Target names are typically host names or IP addresses. [http://en.wikipedia.org/wiki/Internationalized_domain_name Internationalized domain names] (IDN) are converted to ASCII according to [http://www.rfc-editor.org/rfc/rfc3490.txt RFC 3490]. For certain shared hosts (e.g. twitter.com), the target name may also contain part of the path encoded as a subdomain. The encoded subdomain is always lowest in the hierarchy and starts with <tt>_p_</tt> followed by the [http://www.rfc-editor.org/rfc/rfc3548.txt RFC 3548] compliant [http://en.wikipedia.org/wiki/Base32 Base32] encoded path. For example:
 +
<span style="color: green;">_p_</span><span style="color: blue;">k5swex3pmzpvi4tvon2a</span>.twitter.com = twitter.com/<span style="color: blue;">Web_of_Trust</span>
  
 
== Processing large export files ==
 
== Processing large export files ==
 
Many text editors or XML parsers try to load the entire file into memory, which causes problems for larger export files. You can split the XML file into smaller chunks using [http://www.mywot.com/files/misc/splitwot.pl.txt this Perl script] (or any other XML splitter). For example, to split the file into chunks each containing at most 10000 targets, run the command <tt>perl splitwot.pl export.xml 10000</tt>. If you are using Windows, try [http://www.activestate.com/activeperl/ ActivePerl].
 
Many text editors or XML parsers try to load the entire file into memory, which causes problems for larger export files. You can split the XML file into smaller chunks using [http://www.mywot.com/files/misc/splitwot.pl.txt this Perl script] (or any other XML splitter). For example, to split the file into chunks each containing at most 10000 targets, run the command <tt>perl splitwot.pl export.xml 10000</tt>. If you are using Windows, try [http://www.activestate.com/activeperl/ ActivePerl].
 +
 +
[[Category:Technical Details]]

Latest revision as of 14:05, 9 October 2016

Downloading ratings and comments

You can export your own ratings and comments on the "My ratings" tab in your profile.

Renewal

The data is updated within 24 hours after a rating is recorded. The date of the last update is shown next to the "Download" button.

Exported file format

The export file is in XML format.

You'll need an XML viewer or editor to open it.

If the export file is larger than 1 KiB, it's compressed to a ZIP file. An archive utility such as PkZip, 7-Zip, or WinZip may be required to extract the XML file from the ZIP file.
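
Because the download may arrive either as plain XML or as a ZIP archive, a script that processes it has to handle both cases. The following Python sketch is one way to do that. The function name `read_export` is hypothetical, and the code assumes the archive holds a single member whose name ends in `.xml`, which the page does not state.

```python
import zipfile


def read_export(path):
    """Return the raw XML bytes of an export, zipped or not.

    Assumption: a zipped export contains one member ending in ".xml".
    """
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            members = [n for n in archive.namelist() if n.lower().endswith(".xml")]
            if not members:
                raise ValueError("no XML file found in archive")
            return archive.read(members[0])
    # Small exports are delivered as uncompressed XML.
    with open(path, "rb") as handle:
        return handle.read()
```

This reads the whole file into memory, so it is only suitable for small and medium exports.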

Data format

The data is in an XML format. Here is an annotated example:

 <?xml version="1.0" encoding="UTF-8"?>
 <!-- character encoding is always UTF-8 -->
 
 <!-- wot: the root element -->
 <!--   uid: your account id (integer) -->
 
 <wot uid="1">
 
   <!-- target: one element for each rated or commented target -->  
   <!--   name: target name (string) -->
 
   <target name="example.com">
 
     <!-- rating: contains rating components (optional) -->
     <!--   time: last changed (ISO 8601 time) -->
 
     <rating time="2006-09-26 19:16:47+00">
 
       <!-- component: one for each rated component (unrated components not included) -->
       <!--   name: component id (integer ∊ [0, 100]) -->
       <!--     0 = Trustworthiness -->
       <!--     1 = Vendor reliability -->
       <!--     2 = Privacy -->
       <!--     4 = Child safety -->
       <!--   rating: rating value (integer) -->
 
       <component name="0" rating="90"/>
       <component name="4" rating="90"/>
 
     </rating>
 
     <!-- comment: one for each comment (optional) -->
     <!--   category: comment category (integer) -->
     <!--      4 = • Useful, informative -->
     <!--      5 = • Entertaining -->
     <!--      6 = • Good customer experience -->
     <!--      7 = • Child friendly -->
     <!--     21 = • Good site -->
     <!--      8 = • Spam -->
     <!--      9 = • Annoying ads or popups -->
     <!--     10 = • Bad customer experience -->
     <!--     11 = • Phishing or other scams -->
     <!--     12 = • Malicious content, viruses -->
     <!--     13 = • Browser exploit -->
     <!--     14 = • Spyware or adware -->
     <!--     15 = • Adult content -->
     <!--     16 = • Hateful, violent or illegal content -->
     <!--     22 = • Ethical issues -->
     <!--     18 = • Useless -->
     <!--     19 = • Other -->
     <!--   time: last changed (ISO 8601 time) --> 
 
     <comment category="9" time="2008-04-23 16:37:48+00">
       <!-- comment text is always in a CDATA element --> 
       <![CDATA[Comment text.]]>
     </comment>
 
     <!-- ... more comment elements ... -->
   </target>
 
   <!-- ... more target elements for each rated or commented host ... -->
 </wot>
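
The `time` attributes in the example use a space between date and time and an hour-only offset (`+00`), which some date parsers reject. The Python sketch below shows one way to read them. The function name `parse_wot_time` is hypothetical, and the code assumes every timestamp follows the same layout as the two shown in the example.

```python
import re
from datetime import datetime


def parse_wot_time(value):
    """Parse a timestamp such as "2008-04-23 16:37:48+00" into an aware datetime."""
    # Expand an hour-only offset ("+00") to "+00:00" so that %z accepts it.
    value = re.sub(r"([+-]\d{2})$", r"\1:00", value)
    return datetime.strptime(value, "%Y-%m-%d %H:%M:%S%z")
```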

Target names

Target names are typically host names or IP addresses. Internationalized domain names (IDN) are converted to ASCII according to RFC 3490. For certain shared hosts (e.g. twitter.com), the target name may also contain part of the path encoded as a subdomain. The encoded subdomain is always the lowest label in the hierarchy (the leftmost one) and starts with _p_ followed by the path, Base32-encoded in compliance with RFC 3548. For example:

_p_k5swex3pmzpvi4tvon2a.twitter.com = twitter.com/Web_of_Trust
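
The mapping above can be reversed in a few lines of Python. This is a sketch, not the service's own code: the function name `decode_target_name` is hypothetical, and it assumes, based only on the example, that the Base32 text is lowercase with the trailing `=` padding removed and that the path bytes are UTF-8.

```python
import base64

PATH_PREFIX = "_p_"


def decode_target_name(name):
    """Turn "_p_<base32>.host" into "host/<path>"; return other names unchanged."""
    label, _, host = name.partition(".")
    if not label.startswith(PATH_PREFIX) or not host:
        return name
    encoded = label[len(PATH_PREFIX):].upper()
    # Restore the padding that b32decode requires (assumed stripped in exports).
    encoded += "=" * (-len(encoded) % 8)
    path = base64.b32decode(encoded).decode("utf-8")
    return f"{host}/{path}"
```

Calling `decode_target_name("_p_k5swex3pmzpvi4tvon2a.twitter.com")` returns `"twitter.com/Web_of_Trust"`, matching the example.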

Processing large export files

Many text editors or XML parsers try to load the entire file into memory, which causes problems for larger export files. You can split the XML file into smaller chunks using this Perl script (or any other XML splitter). For example, to split the file into chunks each containing at most 10000 targets, run the command perl splitwot.pl export.xml 10000. If you are using Windows, try ActivePerl.
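
As an alternative to splitting the file, a streaming parser can walk the export one target at a time. The sketch below uses `iterparse` from Python's standard `xml.etree.ElementTree` module. The function name `iter_targets` and the shape of the yielded tuples are illustrative choices, not part of the export format.

```python
import xml.etree.ElementTree as ET


def iter_targets(source):
    """Yield (name, ratings, comments) for each <target> without loading the whole file.

    ratings maps component id -> rating value; comments is a list of
    (category, time, text) tuples. source is a file name or binary file object.
    """
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # first event is the start of the <wot> root
    for event, elem in context:
        if event == "end" and elem.tag == "target":
            ratings = {
                int(c.get("name")): int(c.get("rating"))
                for c in elem.iter("component")
            }
            comments = [
                (int(c.get("category")), c.get("time"), (c.text or "").strip())
                for c in elem.iter("comment")
            ]
            yield elem.get("name"), ratings, comments
            root.clear()  # discard targets that were already processed
```

A typical use is `for name, ratings, comments in iter_targets("export.xml"): ...`, which keeps memory use roughly constant regardless of how many targets the file contains.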