2007-07-05

Google, google - save our soils.

There are plenty of computer dictionaries, plenty of programs that translate a word as soon as the mouse hovers over it, and plenty of online dictionaries and translators. That's all great and wonderful.

But I'd like to have a list of everything I translate, so that I can look at the statistics and finally learn by heart the words I forget most often. Maybe flash cards could be generated from them; it seems this is already an established way of learning words.

I need a web service like that, I need a matching script for Greasemonkey, I need a matching dictionary for my computer that can quickly translate the word under the mouse, I need a PDF dictionary integrated with a book reader... and all of this should share one common database of the requests made.
Is there anything similar?
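To make the "common database of requests" idea a bit more concrete, here is a minimal sketch of what such a log could look like. It is only an assumption of mine about one possible shape: it is written for Python 3 with the standard sqlite3 module, and the table, the column names and the functions open_log, log_lookup and most_forgotten are all hypothetical.

```python
import sqlite3

def open_log(path=":memory:"):
    """Open (or create) the shared lookup log.

    The table layout is a hypothetical one made up for this sketch.
    """
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS lookups ("
        " word TEXT NOT NULL,"
        " source TEXT NOT NULL,"
        " looked_up_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )
    return db

def log_lookup(db, word, source):
    """Record one translation request; `source` names the client that made it."""
    db.execute(
        "INSERT INTO lookups (word, source) VALUES (?, ?)",
        (word.strip().lower(), source),
    )
    db.commit()

def most_forgotten(db, limit=10):
    """Return (word, count) pairs, most frequently looked-up words first."""
    return db.execute(
        "SELECT word, COUNT(*) AS n FROM lookups"
        " GROUP BY word ORDER BY n DESC, word ASC LIMIT ?",
        (limit,),
    ).fetchall()
```

Every client (the web service, the Greasemonkey script, the desktop dictionary) would call log_lookup with its own source name, and most_forgotten would then give the candidates for flash cards.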

So far I've written a little script for myself, using http://lingvo.yandex.ru. Once I at least attach a database to it, that will already be happiness.

The code
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys, urllib2, libxslt, libxml2
from html5lib import html5parser

# http://www.dbforums.com/archive/index.php/t-1419974.html
# by John J. Lee
class DumbProxyPasswordMgr:
    # Hands out the same credentials for any realm and URI.
    def __init__(self):
        self.user = self.passwd = None
    def add_password(self, realm, uri, user, passwd):
        self.user = user
        self.passwd = passwd
    def find_user_password(self, realm, authuri):
        return self.user, self.passwd

def word2url(word):
    return "http://lingvo.yandex.ru/en?text=%s" % word

def download(url):
    # Proxy address and credentials are placeholders; put in your own.
    proxy = urllib2.ProxyHandler({"http": "http://proxy:8080"})
    proxy_auth_handler = urllib2.ProxyBasicAuthHandler(DumbProxyPasswordMgr())
    proxy_auth_handler.add_password(None, None, 'user', 'password')
    opener = urllib2.build_opener(proxy, proxy_auth_handler)
    urllib2.install_opener(opener)
    src = urllib2.urlopen(url)
    return src.read()

def html2xml(src):
    # Parse the page with html5lib and serialize it as XML.
    p = html5parser.HTMLParser()
    doc = p.parse(src, 'utf-8')
    return doc.toxml()

def decodeYaXML(src):
    # Apply decode.xslt to the page and return the result as a string.
    styledoc = libxml2.parseFile("decode.xslt")
    style = libxslt.parseStylesheetDoc(styledoc)
    doc = libxml2.parseDoc(src)
    result = style.applyStylesheet(doc, None)
    content = style.saveResultToString(result)
    style.freeStylesheet()
    doc.freeDoc()
    result.freeDoc()
    return content

if __name__ == '__main__':
    if len(sys.argv) < 2:
        print "usage: %s word" % sys.argv[0]
        sys.exit(1)
    word = sys.argv[1]
    result = decodeYaXML(html2xml(download(word2url(word))))
    # Re-encode the UTF-8 result for a cp866 (DOS Cyrillic) console.
    print result.decode('utf-8', 'replace').encode('cp866', 'replace')
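One thing the script glosses over: word2url pastes the word into the query string as is. That works for a plain ASCII word, but a phrase with spaces or a Cyrillic word should be percent-encoded first. A small sketch of how that might look, written for Python 3 (the script itself is Python 2) and assuming the service accepts UTF-8 in the query, which I have not checked:

```python
from urllib.parse import quote

# Same address as in the script above.
BASE_URL = "http://lingvo.yandex.ru/en?text=%s"

def word2url(word):
    """Build the lookup URL, percent-encoding the word as UTF-8.

    safe="" makes quote() escape every reserved character, including "/".
    """
    return BASE_URL % quote(word, safe="")
```

In Python 2 the corresponding function is urllib.quote, which expects a byte string.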



The contents of the decode.xslt file are trivial; everyone can write it to their own taste.

Monsieur, as a connoisseur of perversions, will appreciate the last line.
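For anyone wondering what that line is doing: the stylesheet output is UTF-8, while the console expects the cp866 code page, so the bytes are decoded and then encoded again, with 'replace' on both sides so that nothing raises an exception. Roughly the same thing as a Python 3 sketch; the function name to_console_bytes is hypothetical:

```python
def to_console_bytes(raw, console_encoding="cp866"):
    """Decode UTF-8 bytes leniently, then re-encode them for the console.

    With errors="replace", undecodable bytes become U+FFFD on the way in,
    and characters missing from the code page become "?" on the way out.
    """
    text = raw.decode("utf-8", "replace")
    return text.encode(console_encoding, "replace")
```

Cyrillic letters survive the trip because cp866 contains them; anything else that the code page lacks ends up as a question mark.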
