Even a simple operation on the 7B model might require loading all 14 GB of parameters. At 600 GB/s of memory bandwidth, loading every parameter once takes 14 GB ÷ 600 GB/s ≈ 0.023 seconds. This sets a theoretical ceiling of roughly 43 operations per second for any operation that touches all parameters, before any compute time is counted.
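The arithmetic above can be reproduced with a short sketch. The function name `bandwidth_bound` and the use of decimal gigabytes (1 GB = 10^9 bytes) are assumptions made for illustration; the estimate also ignores caching, compute time, and any overlap between loading and computing.

```python
GB = 1e9  # assumption: decimal gigabytes, matching the round numbers in the text


def bandwidth_bound(model_bytes, bandwidth_bytes_per_s):
    """Return (seconds per full parameter load, max full loads per second)."""
    load_time_s = model_bytes / bandwidth_bytes_per_s
    return load_time_s, 1.0 / load_time_s


# Figures from the text: 14 GB of parameters, 600 GB/s memory bandwidth.
load_time_s, passes_per_s = bandwidth_bound(14 * GB, 600 * GB)
print(f"{load_time_s:.3f} s per pass, ~{passes_per_s:.0f} passes/s")
# prints "0.023 s per pass, ~43 passes/s"
```

Changing either argument shows how the ceiling scales: halving the parameter footprint or doubling the bandwidth doubles the upper bound.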