FLM-101B: Training 101 Billion Parameter Language Model with a $100K Budget

24 September 2023

Researchers from the Beijing Academy of Artificial Intelligence (BAAI) and partner institutions present FLM-101B, an open-source large language model (LLM) with 101 billion parameters trained from scratch on a budget of only $100K. Training LLMs at large scales…

Würstchen: An Open-Source Text-to-Image Model Consuming 16 Times Less GPU than Stable Diffusion 1.4

14 September 2023

Würstchen is an open text-to-image model that generates images faster than diffusion models like Stable Diffusion and consumes significantly less memory while achieving comparable results. The approach is based on a…

Persimmon-8B: An Open Model with a 16k Token Context, Running on a Single GPU

11 September 2023

Researchers from Adept have introduced the open-source language model Persimmon-8B with a 16k token context, which is four times larger than the most compact Llama 2 and text-davinci-002 used in…

Falcon 180B: The Largest Open Language Model Surpasses Llama 2 and GPT 3.5

6 September 2023

The Technology Innovation Institute (TII) in the UAE has unveiled Falcon 180B, the largest open language model, displacing Llama 2 from the top spot in the rankings of pre-trained open-access…

GigaGAN: Open Source Model Generates 512px Images in Just 0.13 Seconds

1 September 2023

GigaGAN, an open-source model with 1 billion parameters, can generate 512×512 pixel images in just 0.13 seconds, significantly faster than diffusion and autoregressive models. Additionally, researchers have developed…

Code Llama: State-of-the-Art Code Creation Model

28 August 2023

The Code Llama model is an enhanced version of Llama 2, designed for code generation, completion, and correction. It’s available for free for both commercial and research purposes. Code Llama…

Google VRDU: Advancing Document Content Understanding with Dataset and Benchmark

27 August 2023

Google has publicly released VRDU, a dataset and benchmark designed for training models in understanding document content. VRDU aims to accelerate the development of models capable of processing complex documents…

Arthur Bench: Framework for Evaluating Language Models

20 August 2023

American startup Arthur has released an open-source framework called Bench for evaluating and comparing the performance of large language models. This tool enables users to select the most suitable language…

ReLoRA: Method for Enhancing Performance in Training Large Language Models

16 August 2023

ReLoRA is a technique for training large transformer-based language models using low-rank matrices, aimed at boosting training efficiency. The effectiveness of this method increases with the scale of the models.…

Audiocraft: Open Source Library for Music and Sound Generation

4 August 2023

Introducing Audiocraft – a PyTorch library with open-source code, designed for generating music and sound from text. It serves as a powerful tool for deep learning-based audio generation research. Within…

Stability AI Introduces Stable Diffusion SDXL 1.0 Model

26 July 2023

Stability AI has announced the release of Stable Diffusion SDXL 1.0, a new version of the popular image generation model. SDXL 1.0 is a foundational model with 3.5 billion parameters…

DragGAN: Open Source Model for Manipulating GAN-Generated Images

6 July 2023

Researchers from the Max Planck Institute, MIT, and Google have introduced DragGAN, an innovative approach that allows for seamless manipulation of images generated using Generative Adversarial Networks (GANs). By leveraging…

MAGVIT: Open Source Generative Video Transformer 10-in-1

29 June 2023

Researchers from Carnegie Mellon University, Google Research, and the University of Georgia have introduced MAGVIT (Masked Generative Video Transformer), an open-source video generation model. MAGVIT is a unified model that…

PandasAI: Data Analysis with Language Models

25 June 2023

PandasAI is a library that allows performing basic data analysis through natural language queries. Users can specify one or multiple dataframes and a text query, and receive the output in…

MusicGen: Open Source Neural Network for Generating Music in Any Genre

13 June 2023

MusicGen is a neural network that generates music based on textual descriptions and melody examples, providing more precise control over the generated output. Researchers conducted extensive empirical research to demonstrate…

Open-source model StarCoder generates code in 86 programming languages

10 May 2023

StarCoder is a state-of-the-art neural network method for code correction and generation from the BigCode research community, MIT, the University of Pennsylvania, and Columbia University. StarCoder improves quality and…

7 Websites with Publicly Available Datasets

2 September 2021

The article provides an overview of sites containing tens of thousands of datasets in the public domain. The datasets presented on these resources cover such areas as healthcare, geography, sociology,…

TextFlint: a library for analyzing NLP models robustness

8 April 2021

TextFlint is a multilingual, multitask platform for analyzing the robustness of NLP models. The open-source release supports English and Chinese; other languages are being developed. Included text processing tools: general and specific…

EmoPy – Python Emotion Recognition Toolkit From ThoughtWorks

30 December 2018

ThoughtWorks, a global technology company working mainly on software development, has open-sourced a Python toolkit for emotion recognition – EmoPy. Created as part of ThoughtWorks Arts, a program which incubates…

“NLP Architect” by Intel: Taking Another Step Forward in Natural Language Processing

25 June 2018

Chatbots are everywhere today. Have you noticed how more and more business websites have a chatbot popping up on their home page? And this is only one out of multiple…