MediaTek chip with Meta’s Llama 2 for on-device generative AI coming soon
Taiwanese chipmaker MediaTek on Thursday announced it is working with Meta’s open-source large language model Llama 2 to build an edge computing ecosystem that accelerates artificial intelligence application development on smartphones, IoT devices, vehicles, and other edge devices. MediaTek said it expects Llama 2-based AI applications to become available on smartphones powered by its next-generation flagship system-on-chip, which is scheduled to hit the market by the end of the year.
“Presently, most Generative AI processing is performed through cloud computing; however, MediaTek’s use of Llama 2 models will enable generative AI applications to run directly on-device as well. Doing so provides several advantages to developers and users, including seamless performance, greater privacy, better security and reliability, lower latency, the ability to work in areas with little to no connectivity, and lower operation cost,” said MediaTek.
MediaTek said its next-generation flagship chipset, to be introduced later this year, will feature a software stack optimised to run Llama 2. In addition, the chip will include hardware advancements intended to expedite the development of on-device generative AI use cases, including an upgraded integrated APU with transformer backbone acceleration, and reduced footprint access to and use of DRAM bandwidth.
“The increasing popularity of Generative AI is a significant trend in digital transformation, and our vision is to provide the exciting community of Llama 2 developers and users with the tools needed to fully innovate in the AI space,” said JC Hsu, corporate senior vice president and general manager of wireless communications business unit at MediaTek.
Meta announced Llama 2 in partnership with Microsoft in July. The large language model, trained for generative AI, is free for both research and commercial use, making it one of the largest open-source LLMs. Meta said Llama 2 was trained on 40 per cent more data (two trillion tokens) than Llama 1 and offers double the context length. Its fine-tuned models have been trained on over one million human annotations.