Could you soon be running AI tasks right on your smartphone? MediaTek says yes

Generative AI, one of the fastest-growing technologies, powers chatbots such as OpenAI's ChatGPT and Google Bard and image generation systems such as Stable Diffusion and DALL-E. Still, it has a significant limitation: these tools rely on cloud-based data centers with hundreds of GPUs to perform the computation needed for every query.

But one day you could run generative AI tasks directly on your mobile device. Or your connected car. Or in your living room, bedroom, and kitchen on smart speakers like Amazon Echo, Google Home, or Apple HomePod.

MediaTek believes this future is closer than we realize. Today, the Taiwan-based semiconductor company announced that it is working with Meta to port the social giant's Llama 2 LLM -- in combination with MediaTek's latest-generation APUs and NeuroPilot software development platform -- to run generative AI tasks on devices without relying on external processing.

Of course, there's a catch: This won't eliminate the data center entirely. Due to the size of LLM datasets (the number of parameters they contain) and the storage system's required performance, you still need a data center, albeit a much smaller one.

For example, Llama 2's "small" dataset is 7 billion parameters, or about 13GB, which is suitable for some rudimentary generative AI functions. However, a much larger version with 70 billion parameters requires proportionally more storage, even with advanced data compression, which puts it outside the practical capabilities of today's smartphones. Over the next several years, LLMs in development will easily be 10 to 100 times the size of Llama 2 or GPT-4, with storage requirements in the hundreds of gigabytes and higher.
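The storage figures above follow from simple arithmetic on parameter counts. The sketch below is a rough estimate only: it assumes each parameter is stored at a fixed bit width (16-bit weights are an assumption here, not something the announcement states) and ignores file-format and runtime overhead. The 70-billion-parameter count is the largest published Llama 2 variant, and the function name is illustrative.

```python
GIB = 1024 ** 3  # bytes per gibibyte


def model_size_bytes(num_params: int, bits_per_param: int) -> int:
    """Raw weight storage in bytes, ignoring format and runtime overhead."""
    return num_params * bits_per_param // 8


# Assumed bit widths: 16-bit weights, plus 8-bit and 4-bit quantization
# as examples of the compression the article alludes to.
for name, params in [("Llama 2 7B", 7_000_000_000),
                     ("Llama 2 70B", 70_000_000_000)]:
    for bits in (16, 8, 4):
        size_gib = model_size_bytes(params, bits) / GIB
        print(f"{name} @ {bits}-bit: {size_gib:.1f} GiB")
```

At 16 bits per parameter, the 7-billion-parameter model works out to 14 billion bytes, which is where a figure of roughly 13GB comes from; the larger model scales linearly with its parameter count.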

That is too much for a smartphone to store while also delivering enough IOPS for database-like performance, but it is well within reach of specially designed cache appliances with fast flash storage and terabytes of RAM. So, for Llama 2, it is possible today to host a device optimized for serving mobile devices in a single rack unit without all the heavy compute. It's not a phone, but it's pretty impressive anyway!

MediaTek expects Llama 2-based AI applications to become available for smartphones powered by its next-generation flagship SoC, scheduled to hit the market by the end of the year.

For on-device generative AI to access these datasets, mobile carriers would have to rely on low-latency edge networks -- small data centers/equipment closets with fast connections to the 5G towers. These data centers would reside directly on the carrier's network, so LLMs running on smartphones would not need to go through many network "hops" before accessing the parameter data.

In addition to running AI workloads on device using specialized processors such as MediaTek's, domain-specific LLMs can be moved closer to the application workload by running them in a hybrid fashion with these caching appliances within the miniature data center -- a "constrained device edge" scenario.

So, what are the benefits of using on-device generative AI?

Reduced latency: Because the data is processed on the device itself, response time drops significantly, especially if local caching is used for frequently accessed parts of the parameter dataset.
Improved data privacy: Because user data (such as a chat conversation or training data submitted by the user) stays on the device, it isn't transmitted to the data center; only the model data travels over the network.
Improved bandwidth efficiency: Today, generative AI tasks require all data from the user conversation to go back and forth to the data center. With localized processing, a large share of that work happens on the device.
Increased operational resiliency: With on-device generation, the system can continue functioning even if the network is disrupted, particularly if the device has a large enough parameter cache.
Energy efficiency: It doesn't require as many compute-intensive resources at the data center, or as much energy to transmit data from the device to the data center.
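
The local parameter cache mentioned in the list above could take many forms. As a minimal sketch, assuming the parameter dataset is split into named shards and that the least recently used shard is the one to evict, it might look like the following. The class, the shard naming, and the `fetch_from_edge` callback are all hypothetical and do not describe MediaTek's or Meta's software.

```python
from collections import OrderedDict
from typing import Callable


class ParameterShardCache:
    """LRU cache of parameter shards, bounded by total bytes held."""

    def __init__(self, capacity_bytes: int,
                 fetch_from_edge: Callable[[str], bytes]):
        self.capacity_bytes = capacity_bytes
        self.fetch_from_edge = fetch_from_edge  # hypothetical network fetch
        self._shards = OrderedDict()  # shard_id -> bytes, oldest first
        self._used = 0
        self.hits = 0
        self.misses = 0

    def get(self, shard_id: str) -> bytes:
        if shard_id in self._shards:
            # Cache hit: mark the shard as most recently used.
            self._shards.move_to_end(shard_id)
            self.hits += 1
            return self._shards[shard_id]
        # Cache miss: fall back to the edge data center.
        self.misses += 1
        data = self.fetch_from_edge(shard_id)
        self._insert(shard_id, data)
        return data

    def _insert(self, shard_id: str, data: bytes) -> None:
        if len(data) > self.capacity_bytes:
            return  # too large to cache; serve it without storing
        # Evict least recently used shards until the new one fits.
        while self._used + len(data) > self.capacity_bytes:
            _, evicted = self._shards.popitem(last=False)
            self._used -= len(evicted)
        self._shards[shard_id] = data
        self._used += len(data)


def fake_edge_fetch(shard_id: str) -> bytes:
    """Stand-in for a network call; returns a 16-byte dummy payload."""
    return bytes(16)


cache = ParameterShardCache(capacity_bytes=32, fetch_from_edge=fake_edge_fetch)
for shard in ["a", "b", "a", "c"]:
    cache.get(shard)
print(cache.hits, cache.misses)
```

Only cache misses touch the network in this sketch, which is the mechanism behind the latency and resiliency points: the larger the cache relative to the working set, the fewer round trips to the edge data center.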

However, achieving these benefits may involve splitting workloads and using other load-balancing techniques to alleviate centralized data center compute costs and network overhead.

In addition to the continued need for a fast-connected edge data center (albeit one with vastly reduced computational and energy requirements), there's another issue: Just how powerful an LLM can you really run on today's hardware? And while there is less concern about on-device data being intercepted across a network, there is the added security risk of sensitive data being compromised on the local device if it isn't properly managed -- as well as the challenge of updating the model data and maintaining data consistency across a large number of distributed edge caching devices.
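
One common way to approach the consistency problem is to publish a manifest of content hashes and have each cache compare its local copies against it. The sketch below assumes such a manifest exists and uses SHA-256. The shard names, the manifest format, and the function names are illustrative only and are not drawn from any vendor's update mechanism.

```python
import hashlib
from typing import Dict, List


def digest(data: bytes) -> str:
    """SHA-256 hex digest of a shard's contents."""
    return hashlib.sha256(data).hexdigest()


def stale_shards(local: Dict[str, bytes],
                 manifest: Dict[str, str]) -> List[str]:
    """Shard ids that are missing locally or differ from the manifest."""
    return sorted(
        shard_id
        for shard_id, expected in manifest.items()
        if shard_id not in local or digest(local[shard_id]) != expected
    )


# Hypothetical manifest published by the model provider.
published = {"layer0": b"weights-v2", "layer1": b"weights-v1"}
manifest = {shard_id: digest(data) for shard_id, data in published.items()}

# An edge cache that still holds an older copy of layer0 and lacks layer1.
local_cache = {"layer0": b"weights-v1"}
print(stale_shards(local_cache, manifest))
```

A cache would re-fetch only the shards this check reports, which keeps update traffic proportional to what actually changed rather than to the full model size.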

And finally, there is the cost: Who will foot the bill for all these mini edge data centers? Today, edge networking is provided by edge service providers such as Equinix, which services such as Netflix and Apple's iTunes rely on, rather than by mobile network operators such as AT&T, T-Mobile, or Verizon. Generative AI service providers such as OpenAI/Microsoft, Google, and Meta would need to work out similar arrangements.

There are a lot of considerations with on-device generative AI, but it's clear that tech companies are thinking about it. Within five years, your on-device intelligent assistant could be thinking all by itself. Ready for AI in your pocket? It's coming -- and far sooner than most people ever expected.
