
Textbooks Are All You Need II: phi-1.5 technical report

September 22nd 2023, 3:46 PM
Evan Goodwin
NLP
Model Optimization

Objective

The paper investigates the capabilities of smaller Transformer-based language models, asking whether high performance can be achieved without relying on either 1) large-scale models or 2) large-scale data.

Central Problem

Large language models (LLMs) have shown transformative capabilities in NLP. However, their vast size poses challenges in training cost, energy consumption, and controllability.

Solution & Methodology

The authors introduce phi-1.5, a 1.3 billion parameter model focused on common sense reasoning in natural language. The model aims to match the performance of models 5-10x its size through a distinctive training approach that leverages "textbook-like" synthetic data. Specifically, the training corpus combines the 7B tokens of phi-1's training data with roughly 20B newly generated synthetic tokens. The model is trained from scratch and evaluated without any instruction fine-tuning or RLHF.
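Because phi-1.5 ships as a base model with no instruction tuning or RLHF, it is prompted as a plain text-completion model rather than a chat assistant. The sketch below shows one way to query it with the Hugging Face transformers library; the checkpoint identifier microsoft/phi-1_5 and the library choice are our assumptions for illustration, not details taken from the summary above.

# Minimal sketch: greedy completion with phi-1.5.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-1_5"  # assumed Hub identifier, not stated in this article
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float32)

# No chat template is applied: the base model was never instruction-tuned,
# so we feed it a raw prompt and let it continue the text.
prompt = "Alice has 5 apples and gives 2 to Bob. How many apples does she have left?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Since there is no RLHF alignment layer, the output should be read as a raw model continuation rather than an assistant-style answer.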

Results

According to the report, phi-1.5 performs comparably to models 5x its size on common sense and language understanding benchmarks, and surpasses most non-frontier LLMs on multi-step reasoning tasks such as grade-school mathematics and basic coding. It also exhibits traits of much larger LLMs, both good (step-by-step reasoning, rudimentary in-context learning) and bad (hallucinations and the potential for toxic or biased generations), though the absence of web data in training improves its behavior on toxicity benchmarks. The authors open-source phi-1.5 to support further research on these topics.
