We are currently looking for a Data Scientist / Machine Learning Engineer to join the general-purpose data science core team and work on tasks covering a wide variety of business needs, with a soft focus on NLP or CV applications.
In this position, you will work with multiple data sources (usually textual, numerical and time-related data) and with both large and small datasets to develop, validate and deploy machine learning models, tune their performance and integrate them into data processing pipelines.
Responsibilities:
- Deal with both structured and unstructured data, collaborate with data engineers on defining data storage formats, state data collection requirements;
- Not only solve technical tasks but understand business needs and offer appropriate solutions, describe a chosen approach to non-technical people;
- Set up reproducible experiments: selection, training, validation and optimization of machine learning models, evaluation of their quality in business-related terms;
- Integrate data preprocessing and model inference into general data processing pipelines;
- Research new tools, papers, etc. in the machine learning area.
Requirements:
- Strong knowledge and deep understanding of:
  - Classical machine learning (linear models, decision trees, ensembles for classification and regression tasks, clustering and dimensionality reduction)
  - Main concepts and stages of the modelling process (validation scheme, regularization, overfitting and generalization, data leaks, feature selection, etc.)
- Experience with Python scientific, visualization and ML-related libraries (numpy, scipy, scikit-learn, etc.)
- Experience with different clustering techniques
- Experience with classic NLP tools and techniques (nltk, spacy, n-grams, skip-grams, TF-IDF, tokenizers, lemmatization, dependency parsing, etc.)
- Experience with NN frameworks, NLP-related architectures and libraries (PyTorch / TensorFlow, HuggingFace, fasttext, flair, sentence transformers, Word2Vec, ELMo, RNN, CNN, Transformer, BERT, etc.)
- Experience in tuning pre-trained models for different NLP tasks
- Good Python programming skills
- Good spoken and written English (at least B1)
- Ability and desire to convert raw business requests into strictly formulated machine learning tasks
- Ability to formulate data gathering (or data labelling) requirements
- At least 2 years of experience in machine learning