NVIDIA Brings CUDA Tile Programming to Julia with cuTile.jl Release

By WebDesk | March 3, 2026 | 3 Mins Read

James Ding
Mar 03, 2026 20:24

NVIDIA releases cuTile.jl, enabling Julia developers to write high-performance GPU kernels using tile-based programming with near-parity Python performance.





NVIDIA has extended its tile-based GPU programming model to Julia developers with the release of cuTile.jl, an open-source package that matches or comes close to the performance of its Python counterpart on compute-intensive workloads.

The package, developed in collaboration with JuliaGPU, represents the latest expansion of CUDA Tile—what NVIDIA has called the most significant addition to CUDA programming since the platform launched in 2006. While Python developers gained access to the tile-based model earlier this year, Julia’s scientific computing community can now tap into the same automatic hardware optimization.

Why Tile-Based Programming Matters

Traditional CUDA development forces programmers to manually manage threads, warps, and memory hierarchies. Tile-based programming flips this: developers describe operations on chunks of data, and the compiler handles hardware mapping automatically. This includes automatic access to Tensor Cores and Tensor Memory Accelerators—specialized hardware that previously required expert-level optimization.

The practical difference shows up in code complexity. A vector addition kernel in traditional CUDA.jl requires explicit thread indexing, bounds checking, and block configuration. The cuTile.jl equivalent reads more like standard array operations, with the compiler handling the low-level details.
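To make the contrast concrete, here is a minimal CPU-only sketch in plain Python. It is an emulation, not CUDA.jl or cuTile.jl code: the function names `vadd_thread_style` and `vadd_tile_style` and their parameters are hypothetical and chosen only for illustration. The first function mimics the per-thread bookkeeping (launch configuration, global index, bounds check); the second works on whole chunks of data.

```python
# CPU-only emulation for illustration. None of these names come from
# CUDA.jl or cuTile.jl; they are hypothetical.

def vadd_thread_style(a, b, threads_per_block=4):
    """Emulate one scalar thread per element, as in a traditional kernel."""
    n = len(a)
    c = [0.0] * n
    # Launch configuration: enough blocks to cover all n elements.
    blocks = (n + threads_per_block - 1) // threads_per_block
    for block_idx in range(blocks):
        for thread_idx in range(threads_per_block):
            i = block_idx * threads_per_block + thread_idx  # explicit global index
            if i < n:  # bounds check for the partially filled last block
                c[i] = a[i] + b[i]
    return c


def vadd_tile_style(a, b, tile_size=4):
    """Emulate a tile program: load a tile, operate on it, store it."""
    c = []
    for start in range(0, len(a), tile_size):
        tile_a = a[start:start + tile_size]  # "load"; slicing clips at the end
        tile_b = b[start:start + tile_size]
        c.extend(x + y for x, y in zip(tile_a, tile_b))  # whole-tile operation
    return c


a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
b = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
assert vadd_thread_style(a, b) == vadd_tile_style(a, b)
```

Both functions return the same result. The difference is where the index arithmetic lives: in the author's code in the first case, and behind the tile abstraction in the second.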

Benchmark Results on Blackwell Hardware

Testing on an NVIDIA GeForce RTX 5080 (Blackwell architecture), cuTile.jl matched Python performance across core operations:

Vector addition hit 838 GB/s versus Python’s 843 GB/s (99% parity). Matrix multiplication reached 50.9 TFLOPS against Python’s 50.5 TFLOPS—actually slightly faster. Matrix transpose achieved 98% parity at 797 GB/s.

Batch matrix multiply showed the largest gap at 91% (43.0 vs 47.5 TFLOPS), while complex control-flow kernels like layer normalization and FFT still need optimization work.
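The parity percentages follow directly from the reported throughput figures. A short Python check, assuming parity is simply the Julia figure divided by the Python figure (the helper name `parity_percent` is ours, not from the source):

```python
# Reported throughput pairs from the article: (cuTile.jl, cuTile Python).
results = {
    "vector addition (GB/s)": (838.0, 843.0),
    "matrix multiplication (TFLOPS)": (50.9, 50.5),
    "batch matrix multiply (TFLOPS)": (43.0, 47.5),
}


def parity_percent(julia, python):
    """Julia throughput as a percentage of Python throughput."""
    return 100.0 * julia / python


for name, (julia, python) in results.items():
    print(f"{name}: {parity_percent(julia, python):.1f}%")
```

The ratios work out to roughly 99.4%, 100.8%, and 90.5%, consistent with the rounded figures quoted above. Matrix transpose is omitted because only the Julia throughput and the parity percentage are given for it.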

Technical Implementation

cuTile.jl uses a custom Julia compiler that intercepts standard library calls—operations like sum, reshape, and basic arithmetic—and routes them to Tile IR operations. This produces the same bytecode format as cuTile Python, feeding into NVIDIA’s tileiras compiler for final GPU machine code generation.
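As a loose analogy for routing ordinary operations to IR operations, the toy Python tracer below records an operation list instead of computing values. It is not how cuTile.jl is implemented: the package works inside a custom Julia compiler, whereas this sketch uses operator overloading. The class `TracedTile`, the helpers `tile_sum` and `trace_kernel`, and the op names `add`, `mul`, and `reduce_sum` are all invented for illustration and are not actual Tile IR names.

```python
# Toy tracer: an analogy only. All names here are hypothetical and do not
# correspond to cuTile.jl, cuTile Python, or real Tile IR operations.

class TracedTile:
    """Stand-in for a tile value; arithmetic records an op instead of computing."""

    def __init__(self, name, trace):
        self.name = name
        self.trace = trace  # shared list of (result, op, operands) records

    def _emit(self, op, *operands):
        result = TracedTile(f"%{len(self.trace)}", self.trace)
        self.trace.append((result.name, op, tuple(o.name for o in operands)))
        return result

    def __add__(self, other):
        return self._emit("add", self, other)

    def __mul__(self, other):
        return self._emit("mul", self, other)


def tile_sum(tile):
    """Hypothetical reduction, recorded as a single op."""
    return tile._emit("reduce_sum", tile)


def trace_kernel(fn, arg_names):
    """Run fn on traced placeholders and return the recorded op list."""
    trace = []
    args = [TracedTile(name, trace) for name in arg_names]
    fn(*args)
    return trace


def kernel(x, y):
    return tile_sum(x * y + x)


for record in trace_kernel(kernel, ["x", "y"]):
    print(record)
```

The kernel body is written with ordinary `*` and `+`, yet what comes out is a linear list of operations that a later compiler stage could consume. That is the general shape of the pipeline described above, minus every real detail.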

The design deliberately mirrors Python’s API structure, making documentation and code examples portable between languages. But it embraces Julia conventions where appropriate: 1-based indexing, broadcast syntax with dots (.^, .-, ./), and native integration with CUDA.jl for array management.

Current Limitations

This remains experimental software. Not all cuTile features work yet. Iterator-based for loops either fail or generate inefficient code. APIs may change without warning. The package requires Blackwell GPUs (compute capability 12.0+) and CUDA 13 drivers—hardware that most developers don’t have access to yet.

For Julia teams already invested in GPU computing through CUDA.jl, cuTile.jl offers a path toward simpler kernel development as Blackwell hardware becomes available. The package is available now at github.com/JuliaGPU/cuTile.jl and can be installed through Julia's package manager.

