Phonetic Symbol Generator

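To help with spoken-English practice, I spent a day tinkering with some code that automatically generates the phonetic symbols and three-line markup for a given article. This post briefly describes how it was built, drawing on web scraping, Python, and JavaScript. To practice my English writing, the rest of the post is written in English.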

To make practicing spoken English easier, I developed the Phonetic Symbol Generator. This article describes how I developed it; you can download it HERE.

How to use it

  1. Open “phonetic-symbol-generator.html” in a browser.
  2. Enter your article in the box and click “Convert”. When the phonetic symbols appear, you can click them to change their vertical position.
  3. If a phonetic symbol is incorrect, click the word above the line to modify it.
  4. Enjoy it!

How I developed it

  • Collect the most frequently used words.

Category|Source
---|---
Basic words|words required in primary school/middle school/high school/CET-4/CET-6/TOEFL/IELTS/GRE
Special forms|plurals (like phenomenon-phenomena), past tenses and past participles (like give-gave-given), comparatives and superlatives (like bad-worse-worst)
Names|frequently used names
Places|popular cities

  • Use Python, including the urllib2 and re modules, to grab the phonetic symbols of the words I collected.

    Get the html content:

      import urllib2  # Python 2 standard library

      def get_url_content(url):
          # Send browser-like headers so the request looks like a normal page visit.
          i_headers = {"User-Agent": "Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.1) Gecko/20090624 Firefox/3.5", "Referer": 'http://www.baidu.com'}
          req = urllib2.Request(url, headers=i_headers)
          return urllib2.urlopen(req).read()
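
Since urllib2 exists only in Python 2, here is a minimal sketch of the same fetch step for Python 3, using urllib.request from the standard library. The helper name `build_request`, the `timeout` parameter, and the UTF-8 decoding are assumptions added for illustration; they are not part of the original script.

```python
import urllib.request

# Header values copied from the snippet above.
HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.1) "
                  "Gecko/20090624 Firefox/3.5",
    "Referer": "http://www.baidu.com",
}


def build_request(url):
    """Build a request object that carries the browser-like headers."""
    return urllib.request.Request(url, headers=HEADERS)


def get_url_content(url, timeout=10):
    """Fetch a page and return its body as text (assumes UTF-8 content)."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Splitting out `build_request` keeps the header handling separate from the network call, so it can be checked without going online.
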
    

    Extract the phonetic symbol and store it in dictionary:

      import re

      # [0:-1] drops the last character of the entry (presumably a trailing newline).
      url = 'http://www.macmillandictionary.com/us/dictionary/american/' + wordlist[i+j][0:-1]
      html_text = get_url_content(url)
      pron_str = re.findall(r'</span>\D*<span class="SEP" context="PRON-after"', html_text)
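
The regular expression returns the phonetic symbol still wrapped in the surrounding markup, so one more step is needed before storing it. A minimal sketch of that step follows, written for Python 3. The HTML fragment is made up for this example and only imitates the structure the pattern expects; the names `extract_pron` and `pron_dict` are also assumptions.

```python
import re

# Same pattern as above: everything between a closing </span>
# and the "PRON-after" separator.
PRON_PATTERN = re.compile(r'</span>\D*<span class="SEP" context="PRON-after"')
PREFIX = '</span>'
SUFFIX = '<span class="SEP" context="PRON-after"'


def extract_pron(html_text):
    """Return the first phonetic transcription found, or None."""
    matches = PRON_PATTERN.findall(html_text)
    if not matches:
        return None
    # Cut the literal markup off both ends of the match.
    return matches[0][len(PREFIX):-len(SUFFIX)].strip()


# Hypothetical markup, written for this example only.
sample_html = (
    '<span class="PRON"><span class="SEP" context="PRON-before">/</span>'
    'həˈloʊ<span class="SEP" context="PRON-after">/</span></span>'
)

pron_dict = {}
pron = extract_pron(sample_html)
if pron is not None:
    pron_dict["hello"] = pron
```

Because `\D*` stops at any digit, a page whose markup contains digits between the two anchors would not match; in that case `extract_pron` returns None and the word is skipped.
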
      
    
  • Write a JavaScript program to generate the phonetic symbols and “stress & light” marks for a given passage.

    1. Find the word in the dictionary.
    2. If it cannot be found in the dictionary, adjust the word to recover its base form:
  • Ends with -ing:

Adjustment on word|Adjustment on phonetic symbols
---|---
remove -ing|add ‘ɪŋ’ at the end
remove -ing and add -e|add ‘ɪŋ’ at the end
remove -ing and the preceding consonant|add ‘ɪŋ’ at the end
remove -ying and add -ie|add ‘ɪŋ’ at the end
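
The -ing rules can be sketched as a function that tries each candidate base form in table order. The actual tool does this in JavaScript; Python is used here to match the scraping code. The function names and the sample transcriptions are assumptions for illustration.

```python
def ing_candidates(word):
    """Yield possible base forms of a word ending in -ing, in table order."""
    if not word.endswith("ing"):
        return
    stem = word[:-3]
    yield stem                       # reading -> read
    yield stem + "e"                 # making  -> make
    if stem and stem[-1] not in "aeiou":
        yield stem[:-1]              # running -> run
    if word.endswith("ying"):
        yield word[:-4] + "ie"       # lying   -> lie


def lookup_ing(word, pron_dict):
    """Return base pronunciation + 'ɪŋ', or None if no base form is known."""
    for base in ing_candidates(word):
        if base in pron_dict:
            return pron_dict[base] + "ɪŋ"
    return None


# Hypothetical dictionary entries, for illustration only.
sample_dict = {"read": "rid", "make": "meɪk", "run": "rʌn", "lie": "laɪ"}
running = lookup_ing("running", sample_dict)
```

The first candidate found in the dictionary wins, so the order of the rows matters.
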
  • Ends with -s:

Adjustment on word|Adjustment on phonetic symbols
---|---
remove -s|if the last letter of the base word is ‘e’, add ‘iz’ at the end; else if the phonetic transcription ends in a voiceless consonant (ptkfθsw∫rh), add ‘s’ at the end; otherwise add ‘z’ at the end
remove -es|the same as above
remove -ies and add -y|the same as above
remove -ves and add -f|add ‘vz’ at the end
remove -ves and add -fe|add ‘vz’ at the end
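
A minimal Python sketch of the first three -s rows follows (the tool itself is written in JavaScript). It reads the table literally, including the ‘e’ check and the consonant set as listed, and leaves out the two -ves rows. Function names and sample transcriptions are assumptions.

```python
# Set copied verbatim from the table, including its '∫' character.
VOICELESS = set("ptkfθsw∫rh")


def s_suffix(base_word, base_pron):
    """Pick the ending to append after stripping -s, -es or -ies."""
    if base_word.endswith("e"):
        return "iz"
    if base_pron and base_pron[-1] in VOICELESS:
        return "s"
    return "z"


def s_candidates(word):
    """Yield possible base forms in table order (the -ves rows are omitted)."""
    if word.endswith("s"):
        yield word[:-1]              # cats   -> cat
    if word.endswith("es"):
        yield word[:-2]              # boxes  -> box
    if word.endswith("ies"):
        yield word[:-3] + "y"        # cities -> city


def lookup_s(word, pron_dict):
    """Return base pronunciation plus the chosen ending, or None."""
    for base in s_candidates(word):
        if base in pron_dict:
            pron = pron_dict[base]
            return pron + s_suffix(base, pron)
    return None


# Hypothetical dictionary entries, for illustration only.
sample_dict = {"cat": "kæt", "dog": "dɔɡ", "city": "ˈsɪti", "house": "haʊs"}
```

Note that the ‘e’ check looks at spelling while the other two checks look at the transcription, exactly as the table describes.
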
  • Ends with -ed:

Adjustment on word|Adjustment on phonetic symbols
---|---
remove -d|if the phonetic transcription ends in ‘t’ or ‘d’, add ‘ɪd’ at the end; else if it ends in a voiceless consonant (ptkfθsw∫rh), add ‘t’ at the end; otherwise add ‘d’ at the end
remove -ed|the same as above
remove -ied and add -y|the same as above
remove -ed and the preceding consonant|the same as above
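
The -ed rows can be sketched the same way: first recover a base form, then choose the ending from the last symbol of its transcription. This is a Python illustration of the table, not the tool's JavaScript; names and sample transcriptions are assumptions.

```python
# Set copied verbatim from the table, including its '∫' character.
VOICELESS = set("ptkfθsw∫rh")


def ed_suffix(base_pron):
    """Choose 'ɪd', 't' or 'd' from the last symbol of the transcription."""
    last = base_pron[-1:]
    if last in ("t", "d"):
        return "ɪd"
    if last in VOICELESS:
        return "t"
    return "d"


def ed_candidates(word):
    """Yield possible base forms of a word ending in -d/-ed, in table order."""
    if not word.endswith("d"):
        return
    yield word[:-1]                      # used    -> use
    if word.endswith("ed"):
        stem = word[:-2]
        yield stem                       # walked  -> walk
        if word.endswith("ied"):
            yield word[:-3] + "y"        # tried   -> try
        if stem and stem[-1] not in "aeiou":
            yield stem[:-1]              # stopped -> stop


def lookup_ed(word, pron_dict):
    """Return base pronunciation plus the chosen ending, or None."""
    for base in ed_candidates(word):
        if base in pron_dict:
            return pron_dict[base] + ed_suffix(pron_dict[base])
    return None


# Hypothetical dictionary entries, for illustration only.
sample_dict = {"use": "juz", "walk": "wɔk", "want": "wɑnt",
               "try": "traɪ", "stop": "stɑp"}
```

The ‘t’/‘d’ check has to come before the voiceless check, because ‘t’ is also in the voiceless set.
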
  • Ends with -er:

Adjustment on word|Adjustment on phonetic symbols
---|---
remove -r|add ‘ər’ at the end
remove -er|the same as above
remove -ier and add -y|the same as above
remove -er and the preceding consonant|the same as above
  • Ends with -est:

Adjustment on word|Adjustment on phonetic symbols
---|---
remove -st|add ‘ɪst’ at the end
remove -est|the same as above
remove -iest and add -y|the same as above
remove -est and the preceding consonant|the same as above
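
Putting the two steps together for -er and -est, here is a minimal Python sketch: look the word up directly, and only fall back to the suffix rules when that fails. It assumes the -est rules mirror the -er rules (strip -st, -est, -iest plus ‘y’, or -est plus the preceding consonant). The function names and sample transcriptions are illustrative, and the real tool is written in JavaScript.

```python
# Word ending -> text appended to the base transcription.
ENDINGS = {"er": "ər", "est": "ɪst"}


def base_candidates(word, ending):
    """Yield candidate base forms for a word ending in 'er' or 'est'."""
    stem = word[:-len(ending)]
    yield stem + "e"                 # nicer   -> nice  (only -r / -st removed)
    yield stem                       # fastest -> fast
    if stem.endswith("i"):
        yield stem[:-1] + "y"        # happier -> happy
    if stem and stem[-1] not in "aeiou":
        yield stem[:-1]              # biggest -> big


def lookup(word, pron_dict):
    """Step 1: direct hit. Step 2: recover the base form and add the ending."""
    if word in pron_dict:
        return pron_dict[word]
    for ending, pron_suffix in ENDINGS.items():
        if word.endswith(ending):
            for base in base_candidates(word, ending):
                if base in pron_dict:
                    return pron_dict[base] + pron_suffix
    return None


# Hypothetical dictionary entries, for illustration only.
sample_dict = {"nice": "naɪs", "fast": "fæst", "happy": "ˈhæpi",
               "big": "bɪɡ", "water": "ˈwɔtər"}
```

Checking the dictionary first means a word such as "water" is never mistaken for a comparative.
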