R or Python project

Closed. Posted 5 years ago. Paid on delivery.

SUBJECT / AREA: "Azerbaijan Old Literature"

You are expected to create a CORPUS of text in the Azerbaijani language that belongs to the category specified in SUBJECT.

REQUIREMENTS:

- Only Azerbaijani text is allowed (any text in another language must be excluded);

- All final text must be in a plain-text file;

- Sentences shorter than 3 words must be excluded;

- Only complete sentences should be used;

- Poems/poetry are not allowed;

- Each sentence must start with a letter (the first character can't be a number or any other symbol like "-, _, (, ), ..." etc.);

- Format is one sentence per line: each sentence must start on a new line and end with ".";

- Broken sentences (when a sentence has EOF in the middle) are not allowed;

- Only a single space between words;

- All page-numbers, headers, titles, etc. must be excluded - just senten
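The mechanical rules in the list above (single spaces, at least 3 words, first character is a letter, final ".") can be checked automatically. Below is a minimal sketch in Python, assuming the text has already been split into candidate sentences. The function names and the sample sentences are hypothetical and chosen for illustration only. Language detection, poetry detection, sentence completeness, and removal of headers or page numbers are not covered and would need separate handling.

```python
MIN_WORDS = 3  # sentences shorter than 3 words are excluded


def normalize_spaces(text: str) -> str:
    """Collapse every run of whitespace into one space and trim the ends."""
    return " ".join(text.split())


def is_valid_sentence(sentence: str) -> bool:
    """Check the mechanical rules: word count, first character, final period."""
    if len(sentence.split()) < MIN_WORDS:
        return False
    # str.isalpha() is Unicode-aware, so letters such as "Ə" or "Ş" pass.
    if not sentence[0].isalpha():
        return False
    return sentence.endswith(".")


def clean_lines(candidates):
    """Return normalized sentences that pass the checks, in input order."""
    cleaned = []
    for raw in candidates:
        sentence = normalize_spaces(raw)
        if sentence and is_valid_sentence(sentence):
            cleaned.append(sentence)
    return cleaned


if __name__ == "__main__":
    # Made-up sample input, not taken from any real source book.
    samples = [
        "Bu   kitab çox  maraqlıdır.",   # kept after whitespace cleanup
        "- Salam, necəsən dostum.",      # starts with a symbol
        "Çox yaxşı.",                    # fewer than 3 words
        "1920-ci ildə nəşr olundu.",     # starts with a digit
        "Əsər tamamlanmamış qalır",      # no final period
    ]
    # One sentence per line, as the output format requires.
    print("\n".join(clean_lines(samples)))
```

Writing the result with `"\n".join(...)` to a UTF-8 `.txt` file would give the one-sentence-per-line layout; the remaining requirements still need manual review or additional tooling.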

DELIVERABLES:

1) Final Textual file (.TXT) with all sentences

2) List of book titles used as sources

3) Source files (.PDF, .DOC electronic books) from which the text was extracted

TEXT SOURCE: It is entirely your responsibility to find publicly available text sources (.PDF, .DOC, etc.)


Python, R Programming Language, Research

Project ID: #16941470

About the project

Remote project. Active 5 years ago.