Weekly Tech News – Week #8

Meta

Meta launched its AI Research SuperCluster (RSC), a massive supercomputer comprising 760 NVIDIA DGX A100 systems as its compute nodes, for a total of 6,080 A100 GPUs (more powerful than the previous-generation V100). RSC's storage tier has 175 petabytes of Pure Storage FlashArray, 46 petabytes of cache storage in Penguin Computing Altus systems, and 10 petabytes of Pure Storage FlashBlade. Read more on their official blog.

OpenAI

OpenAI has released a new blog post, "Aligning Language Models to Follow Instructions", where they fine-tune GPT-3 to follow instructions given by humans. With this, the GPT-3 model can be trained in a question/answer fashion and later generate answers on its own. This is "just" fine-tuning of their previous model, but interestingly it opens new research questions that are yet to be answered. Visit their official blog here to learn more about it.
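To make the setup above concrete, here is a minimal sketch of how instruction-following fine-tuning data is typically laid out as (prompt, completion) pairs. The separator token and field names are illustrative assumptions, not OpenAI's documented schema.

```python
# Illustrative sketch: instruction-following data as (prompt, completion)
# pairs. The "###" separator and field names are assumptions for
# illustration, not OpenAI's exact fine-tuning schema.

def format_example(instruction: str, answer: str) -> dict:
    """Pair a human-written instruction with the desired answer."""
    return {
        "prompt": instruction.strip() + "\n\n###\n\n",
        "completion": " " + answer.strip(),
    }

examples = [
    format_example("Explain gravity to a 6-year-old.",
                   "Gravity is what pulls things down to the ground."),
    format_example("Translate 'bonjour' to English.",
                   "Hello."),
]

print(examples[1]["completion"])  # -> " Hello."
```

A model fine-tuned on many such pairs learns to treat the text before the separator as an instruction and to continue with an answer.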

Meta AI

Meta AI writes that they are releasing a series of multilingual autoregressive language models with up to 7.5 billion parameters, which significantly outperform English-centric language models in few-shot learning on more than 20 languages. Check out the details in their recent tweet.
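Few-shot learning here means the model is shown a handful of labeled demonstrations in its prompt and then asked to continue the pattern for a new input, with no gradient updates. A minimal sketch of how such a prompt is assembled (the task, template, and labels are made up for illustration):

```python
# Sketch of few-shot prompting: k labeled demonstrations are concatenated
# in front of the unlabeled query, and the model continues the text.
# The sentiment task and "=>" template are illustrative assumptions.

def build_few_shot_prompt(demos, query, template="{text} => {label}"):
    """Concatenate k demonstrations, then the query with an empty label."""
    lines = [template.format(text=t, label=l) for t, l in demos]
    lines.append(template.format(text=query, label="").rstrip())
    return "\n".join(lines)

demos = [("c'est génial", "positive"), ("quelle horreur", "negative")]
prompt = build_few_shot_prompt(demos, "un film magnifique")
print(prompt)
# c'est génial => positive
# quelle horreur => negative
# un film magnifique =>
```

A multilingual model that performs well few-shot completes the last line with the correct label even when the demonstrations and query are in a language it was never fine-tuned on.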

Google's New Dialog Language Model

Google released a paper, "LaMDA: Language Models for Dialog Applications", along with a blog post detailing how they build dialog models on top of large language models. The model is fine-tuned on dialog data and evaluated with metrics such as safety, sensibleness, specificity, and interestingness. The model can also perform factual grounding, whereby it first looks up external sources before it answers. Check out their blog and research paper here.

EvoGym

Evolution Gym (EvoGym) is a large-scale benchmark for evaluating soft robots. Contrary to classical reinforcement learning, where optimal control is well studied, less attention is placed on finding the optimal robot design. EvoGym tries to close this gap: each robot can be composed of different types of voxels (e.g., soft, rigid, actuators), enabling a more modular and expressive robot design. The benchmark environments support a wide range of tasks, including locomotion on various types of terrain and manipulation. For more details, check out their Getting Started tutorials and their GitHub code here.
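The voxel-based design space described above can be sketched as a 2D grid where each cell is a voxel type. The integer codes below mirror EvoGym's convention as I understand it (0 = empty, 1 = rigid, 2 = soft, 3/4 = actuators) but should be treated as an assumption; the connectivity check is a common validity constraint on such designs, implemented here from scratch.

```python
# A robot design as a 2D grid of voxel types. Codes are an assumption
# modeled on EvoGym's convention: 0 empty, 1 rigid, 2 soft, 3/4 actuators.
EMPTY, RIGID, SOFT, H_ACT, V_ACT = 0, 1, 2, 3, 4

body = [
    [RIGID, RIGID, RIGID],
    [SOFT,  EMPTY, SOFT],
    [V_ACT, RIGID, V_ACT],
]

def is_connected(grid):
    """Valid design: every non-empty voxel reachable from any other
    via 4-neighborhood adjacency (simple depth-first search)."""
    cells = {(r, c) for r, row in enumerate(grid)
             for c, v in enumerate(row) if v != EMPTY}
    if not cells:
        return False
    stack, seen = [next(iter(cells))], set()
    while stack:
        r, c = stack.pop()
        if (r, c) in seen:
            continue
        seen.add((r, c))
        for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if nb in cells:
                stack.append(nb)
    return seen == cells

print(is_connected(body))  # -> True: the sample body is one connected piece
```

Searching over grids like this (subject to validity constraints) is the design-optimization half of the problem EvoGym benchmarks, alongside learning a controller for each design.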

Stable-Baselines3 Integrated into the Hugging Face Hub 🤗

Stable-Baselines3 is a library that provides baseline implementations of reinforcement learning algorithms such as deep Q-learning (DQN), proximal policy optimization (PPO), etc. It is now integrated with the Hugging Face Hub, which lets you host your saved models and load trained models from the community. More on this in their posts here and here.

Transformer Reinforcement Learning (TRL)

Here is a library on GitHub worth mentioning: it trains transformer language models using reinforcement learning approaches such as proximal policy optimization (PPO).
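At the core of PPO is the clipped surrogate objective, which limits how far each update can move the new policy from the old one. A minimal scalar sketch of that loss (not TRL's actual API):

```python
# The PPO clipped surrogate loss for a single sample:
#   L = -min(r * A, clip(r, 1 - eps, 1 + eps) * A)
# where r = pi_new(a|s) / pi_old(a|s) and A is the advantage estimate.
# Clipping removes the incentive to push r outside [1 - eps, 1 + eps].

def ppo_clip_loss(ratio: float, advantage: float, eps: float = 0.2) -> float:
    """Negative clipped surrogate objective (to be minimized)."""
    clipped = max(1.0 - eps, min(ratio, 1.0 + eps))
    return -min(ratio * advantage, clipped * advantage)

# A large ratio with positive advantage is capped at 1 + eps:
print(ppo_clip_loss(1.5, 1.0))  # -> -1.2
```

In TRL's setting, the "actions" are generated tokens and the advantage comes from a reward signal on the generated text, but the clipping mechanism is the same.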