Read-only root filesystem — the container itself is immutable
One challenge is having enough training data. Another is that the training data needs to be free of contamination: for a model trained up to 1900, no information from after 1900 can leak into the data. Some metadata might carry that kind of leakage. Zero leakage isn’t possible, because there’s a shadow of the future on past data: what we store is a function of what we care about. But it’s possible to get leakage low enough for this to be interesting.
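As a rough illustration of the kind of leakage check described above, here is a minimal sketch in Python. It assumes each document is a dict with hypothetical `year`, `text`, and `metadata` fields (none of these names come from the original), and it treats documents dated 1900 or earlier as in range. It drops anything dated after the cutoff, plus anything whose text or metadata mentions a later four-digit year. This is only a heuristic, not a description of how any real pipeline does it.

```python
import re

CUTOFF_YEAR = 1900  # cutoff taken from the text; all other names are illustrative

# Matches standalone four-digit years from 1000 to 2099.
YEAR_PATTERN = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")


def mentions_later_year(text, cutoff=CUTOFF_YEAR):
    """Return True if the text contains a four-digit year after the cutoff."""
    return any(int(year) > cutoff for year in YEAR_PATTERN.findall(text))


def filter_documents(docs, cutoff=CUTOFF_YEAR):
    """Keep documents dated at or before the cutoff that mention no later year.

    Each doc is assumed to be a dict with 'year' (int), 'text' (str),
    and an optional 'metadata' (str) field. These are hypothetical names.
    """
    kept = []
    for doc in docs:
        if doc["year"] > cutoff:
            continue  # published after the cutoff
        if mentions_later_year(doc["text"], cutoff):
            continue  # the body refers to a later year
        if mentions_later_year(doc.get("metadata", ""), cutoff):
            continue  # e.g. a digitization note added later
        kept.append(doc)
    return kept
```

A check like this will throw away some clean documents (a sentence about "2000 soldiers" looks like a year) and will miss leakage that carries no date at all, which is one reason zero leakage is out of reach.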
d=4 now works with rank-3 factorization + grokking (311 params trained)