server : handle models with missing EOS token (#8997)
server : fix segfault on long system prompt (#8987)
* server : fix segfault on long system prompt
* server : fix parallel generation with very small batch sizes
* server : fix typo in comment
server : init stop and error fields of the result struct (#9026)
server : fix duplicated n_predict key in the generation_settings (#8994)
server : support reading arguments from environment variables (#9105)
* server : support reading arguments from environment variables
* add -fa and -dt
* readme : specify non-arg env var
server : add some missing env variables (#9116)
* server : add some missing env variables
* add LLAMA_ARG_HOST to server dockerfile
* also add LLAMA_ARG_CONT_BATCHING
Credit goes to the respective authors.
Not a single merge conflict occurred.
Compiled and tested; no bugs found.