HPCC Systems: Chinese Dictionary

Building a Chinese Dictionary for doing NLP

August 4, 2022

Today was the Final Evaluation day with Lorraine and David. We went over the expectations of the program, how mentor and mentee felt about each other, and the progress we made. We also went over the poster contest and my poster abstract. I have uploaded my parsers to GitHub so that future interns can refer to them.

Instructions for using Github scripts

On https://github.com/VisualText/dict-zh I have posted the two scripts I used to parse Wiktionary and JSON files into individual Chinese files. Here are instructions for using them.

Breaker:
1. Download the XML file from the Wiktionary dump to the same directory as breaker.py: https://dumps.wikimedia.org/enwiktionary/latest/ (this is for the English Wiktionary)
2. Change line 4 of breaker to the file name
3. Create

Continue reading “Instructions for using Github scripts”
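To give a rough idea of what pulling Chinese entries out of a Wiktionary XML dump can look like, here is a minimal sketch. It is not the breaker.py from the repository: the function names, the CJK check on page titles, and the tiny in-memory sample are all assumptions made for illustration, and a real dump would be passed in as a file name instead.

```python
import io
import re
import xml.etree.ElementTree as ET

# Assumption: a page counts as "Chinese" if its title contains at least one
# character from the CJK Unified Ideographs block. The real script may differ.
CJK = re.compile(r"[\u4e00-\u9fff]")


def local_name(tag):
    """Strip the XML namespace from a tag, e.g. '{ns}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def iter_chinese_pages(source):
    """Yield (title, text) for every <page> whose title contains a CJK character.

    `source` may be a file name or a binary file object. The dump is streamed
    with iterparse so the whole file is never held in memory at once.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title, text = "", ""
        for child in elem.iter():
            name = local_name(child.tag)
            if name == "title":
                title = child.text or ""
            elif name == "text":
                text = child.text or ""
        if CJK.search(title):
            yield title, text
        elem.clear()  # release the finished page's children


# Hypothetical two-page sample standing in for the downloaded dump.
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page><title>dictionary</title><revision><text>==English==</text></revision></page>
  <page><title>字典</title><revision><text>==Chinese==</text></revision></page>
</mediawiki>""".encode("utf-8")

pages = list(iter_chinese_pages(io.BytesIO(SAMPLE)))
for title, text in pages:
    print(title, text)
```

Streaming matters here because the full English Wiktionary dump is large; reading it page by page keeps memory use flat.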
