7 Comments
Stephen Fossey:

There might be additional benefits to data center operators from adjusting chip clocking: over-clocking a GPU is generally thought to decrease its lifetime. For example: https://techreviewadvisor.com/how-overclocking-affects-your-gpus-lifespan/

Daniel King:

Another thought, Stephen:

Is there an underlying claim that wear and tear gets *so bad* that the chip will soon have to reduce voltage or clock frequency *anyway* (because transistor precision has been compromised)?

Daniel King:

This is an excellent point, with at least a few implications or corollaries to keep in mind.

1) It can be perfectly rational to sacrifice the “wall clock” lifespan of a GPU if that means getting more FLOPs out of it (see the sketch after this list).

2) NVIDIA’s tech improves so fast that by the tail end of a chip’s life, it may be preferable simply to use the new product.

3) A business advantage goes to the compute provider who’s willing to serve first and serve up those workloads fast, so I’ll be curious to see the lengths to which DCs go, even if/when that means wearing GPUs harder.
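
A rough, purely illustrative sketch of the trade-off in point 1 (every number below is a made-up assumption, not a measurement):

```python
# Back-of-the-envelope sketch of the FLOPs-vs-lifespan trade-off.
# Every number here is an illustrative assumption, not a measurement.

def discounted_work(throughput: float, lifespan_years: int,
                    discount_rate: float = 0.25) -> float:
    """Sum of each year's output, discounted: compute delivered sooner is
    worth more because newer, faster chips keep arriving (point 2)."""
    return sum(throughput / (1 + discount_rate) ** year
               for year in range(lifespan_years))

stock       = discounted_work(throughput=1.00, lifespan_years=5)  # stock clocks
overclocked = discounted_work(throughput=1.15, lifespan_years=4)  # +15% clocks, one year less life

print(f"stock:       {stock:.2f}")        # ~3.36
print(f"overclocked: {overclocked:.2f}")  # ~3.39
# With a high enough discount rate, the front-loaded output of the
# overclocked chip edges out the longer-lived stock configuration.
```

Undiscounted, the overclocked chip here actually delivers fewer total FLOPs (4.6 vs. 5.0 per unit of stock throughput); it's the discounting of future compute, i.e. point 2, that tips the economics.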

Stephen Fossey:

That question is beyond my expertise. I just know that high temperatures are bad for lifetimes, and that temperature cycling is bad because, in general, there’s a coefficient of thermal expansion mismatch between the chips and everything attached to them, like heat sinks.
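
For a sense of scale, a small illustrative calculation of that mismatch (CTE values are ballpark textbook figures; the temperature swing is an assumed example):

```python
# Thermal-cycling strain from a coefficient-of-thermal-expansion (CTE) mismatch.
# CTE values are ballpark textbook figures (per kelvin); the temperature
# swing is an assumed example.
CTE_SILICON = 2.6e-6   # silicon die
CTE_COPPER  = 17e-6    # copper heat spreader / heat sink

delta_T = 50.0  # kelvin swing per load cycle (assumed)

# For bonded materials, the constrained-expansion mismatch strain per cycle
# is roughly the CTE difference times the temperature swing.
mismatch_strain = (CTE_COPPER - CTE_SILICON) * delta_T
print(f"mismatch strain per cycle: {mismatch_strain:.1e}")  # ~7.2e-04

# Repeated cycles of this strain fatigue the joints between the layers,
# which is why frequent temperature cycling shortens lifetimes even when
# peak temperatures stay within spec.
```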

F. Ichiro Gifford:

Oh hey, a fellow traveler in AI-energy policy!

Daniel King:

Giddy to bump into the one and only Dr. Gifford out here in the wild.

Daniel King:

Aye aye, captain ⚓️🫡
