For years, co-founder and chief executive officer Jensen Huang and other higher-ups at Nvidia have been banging on the message that the company is more than its GPUs, that the chips that have become ...
Google says its new TurboQuant method could improve how efficiently AI models run by compressing the key-value cache used in LLM inference and supporting more efficient vector search. In tests on ...
Nvidia CEO Jensen Huang debuted a new AI inference system during his GTC conference keynote. The product incorporates technology from Groq, with which Nvidia made a $20 billion deal. The chip can ...
Unlock the full InfoQ experience by logging in! Stay updated with your favorite authors and topics, engage with content, and download exclusive resources. Dany Lepage discusses the architectural ...
Nvidia Corp. is reportedly working on a dedicated inference processor that will be used by OpenAI Group PBC and other artificial intelligence companies to develop faster and more efficient models, ...
Inference will take over for training as the primary AI compute moving forward. Broadcom has struck gold with its custom ASICs for AI hyperscalers. Arm Holdings should benefit immensely as inference ...
SAN FRANCISCO, Feb 2 (Reuters) - OpenAI is unsatisfied with some of Nvidia’s latest artificial intelligence chips, and it has sought alternatives since last year, eight sources familiar with the ...
Microsoft has announced the launch of its latest chip, the Maia 200, which the company describes as a silicon workhorse designed for scaling AI inference. The 200, which follows the company’s Maia 100 ...
Artificial intelligence inference startup Baseten Labs Inc. has raised $300 million in new funding on a $5 billion valuation. The round was co-led by Institutional Venture Partners LP and CapitalG LP, ...
Google researchers have warned that large language model (LLM) inference is hitting a wall amid fundamental problems with memory and networking problems, not compute. In a paper authored by ...
Nvidia’s $20 billion strategic licensing deal with Groq represents one of the first clear moves in a four-front fight over the future AI stack. 2026 is when that fight becomes obvious to enterprise ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results