Yunhai Hu is a computer science master's student at NYU and a Research Collaborator at Yale, specializing in efficient Large Language Model inference. He has industry experience as a Machine Learning Engineer at bilibili and has co-authored multiple publications on AI model optimization.
He has co-authored multiple research papers on "Speculative Decoding," a technique for accelerating inference in large-scale autoregressive models.