Posts

Showing posts with the label language

Integrating Unicode CLDR: Adding Locale Support

So far translation data is ready to be used for translation of date strings. This includes combined data from unicode CLDR and supplementary data contributed by many individuals. Initially the data from unicode CLDR was stored as json files and supplementary data was stored as yaml files and data from both these sources were to be combined after loading each separately which was then supposed to be used for translating dates. But after discussion with my mentor we agreed upon a better approach suggested by him that we could store data directly in python modules. Storing data in python modules directly has the advantage that importing data from python modules is faster than loading from json and yaml files. And considering the fact that we don't have to combine data from cldr and supplementary data files at run time, storing data in Python modules proves to be much more efficient. As the data is ready, now is the time to make necessary modifications in the codebase to support l...

Integrate Unicode CLDR: Working with Retrieved Data

I have been working on scripts and modifying data for quite some time and it seems the data is ready to be used for parsing dates. The data retrieved from Unicode CLDR  has been divided into two parts: numeral_translation_data , that will be specifically used to parse numerals, and date_translation_data ,   that will be used to parse date strings after modifying it by parsing numerals if included in the date string. The existing data that was contributed by various individuals to dateparser has been modified to supplement the data retrieved from CLDR, i.e., only the portion of data that is not included in data retrieved from CLDR remains as supplementary data , which will continue to be modified by contributors in future. The data for date translation consists of translations for months, weekdays and periods, date order, and translations for relative-type dates. This data is stored for different languages and for each language, locale-specific data is stored for ...