Many organizations maintain a domain-specific knowledge base for each of their departments, e.g., human resources or financial risk. In this project, students will develop a chatbot that facilitates the retrieval of specific information from such a knowledge base through question-and-answer dialogues. That is, the chatbot ‘understands’ input questions and provides relevant answers, which in turn may prompt further questions from the user.
Students will learn how to build a chatbot from scratch. This involves text-processing techniques such as text cleaning, statistical language modelling, and vocabulary creation. They will develop two types of chatbots: rule-based chatbots and generative (AI-based) chatbots. Students will start with text-processing techniques such as tokenization, stemming, and lemmatization, and then learn about BNF grammars and rule-based matching. In addition, they will be introduced to common feature engineering techniques such as term frequency-inverse document frequency (TF-IDF) and latent semantic indexing (LSI), as well as more advanced methods such as Word2Vec and Doc2Vec. More importantly, students will learn how to design, build, train, validate, and test neural language models such as RNNs and LSTMs. Finally, particular attention will be given to Seq2seq models, which encode an input question and decode an output answer using an RNN/LSTM-based language generation model.
Python 3.7 is the main programming language used in the course. Throughout this project, students will use a variety of important Python packages, such as NLTK and spaCy for text processing, Keras and TensorFlow for designing and training neural networks, and common packages such as NumPy, SciPy, scikit-learn, and pandas.
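To give a flavour of the retrieval idea described above, the following is a minimal sketch of a question-answering lookup based on tokenization, TF-IDF weighting, and cosine similarity. It uses only the Python standard library rather than the packages listed above. The knowledge-base entries, the function names, the smoothed IDF variant, and the similarity threshold are all assumptions made for illustration and are not taken from the course material.

```python
import math
import re
from collections import Counter

# Hypothetical toy knowledge base: (question, answer) pairs invented for illustration.
KNOWLEDGE_BASE = [
    ("How many vacation days do employees get?",
     "Employees receive 20 vacation days per year."),
    ("How do I submit an expense report?",
     "Submit expense reports through the finance portal."),
    ("Who approves a financial risk assessment?",
     "The risk committee approves each assessment."),
]

FALLBACK = "Sorry, I could not find an answer."


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_idf(token_lists):
    """Compute a smoothed inverse document frequency for every known term."""
    n_docs = len(token_lists)
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))  # count each term once per document
    return {term: math.log((1 + n_docs) / (1 + df)) + 1.0
            for term, df in doc_freq.items()}


def tfidf_vector(tokens, idf):
    """Map tokens to a sparse {term: weight} vector; unknown terms are dropped."""
    counts = Counter(tokens)
    return {term: count * idf[term] for term, count in counts.items() if term in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Index the stored questions once, up front.
_TOKENS = [tokenize(question) for question, _ in KNOWLEDGE_BASE]
_IDF = build_idf(_TOKENS)
_VECTORS = [tfidf_vector(tokens, _IDF) for tokens in _TOKENS]


def answer(question, threshold=0.1):
    """Return the answer whose stored question is most similar to the input.

    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    query = tfidf_vector(tokenize(question), _IDF)
    scores = [cosine(query, vector) for vector in _VECTORS]
    best = max(range(len(scores)), key=scores.__getitem__)
    if scores[best] < threshold:
        return FALLBACK
    return KNOWLEDGE_BASE[best][1]


if __name__ == "__main__":
    print(answer("How do I submit my expense report?"))
```

A retrieval approach like this can only return answers that already exist in the knowledge base; the generative Seq2seq models mentioned above instead produce the answer text token by token.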
POTENTIAL EMPLOYMENT OPPORTUNITIES