Why Human Evaluation Is the Missing Piece in World Model Development | OWL Blog