-
| Previous Value | Current Value |
| Open | Closed |
-
Checked, and OneDev is using a large buffer for all file I/O operations. It can be slow because cache upload is not simply a file copy; it also needs to go through the web server.
-
Have you also checked any library or JDK code involved? The speed we see basically matches what we get if we use `dd` with an 8 KB buffer. If OneDev uses large buffers that is great, but if the underlying code that actually sends the data over the wire uses its own, smaller buffers, then the large OneDev buffer is basically useless in terms of performance.
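To check that, something along these lines could be dropped into the copy path temporarily to log how much data each `read()` call actually delivers (purely illustrative code, not taken from OneDev):

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Illustrative probe: copies a stream with a 64 KB buffer and logs how many
// bytes each read() call actually returns, regardless of the buffer size.
public final class ChunkSizeProbe {

    public static long copyAndLog(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[64 * 1024]; // 64 KB, like the OneDev call sites
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            System.out.println("read() returned " + n + " bytes"); // may be far less than 64 KB
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }
}
```
-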
Our web proxy in front of OneDev does not use proxy buffers. The proxy should be irrelevant anyway, because the agent sends the data to OneDev and OneDev then uses its own buffers to write it to disk. If a OneDev buffer isn't full, it should wait before writing. Writing to disk then triggers NFS writes, which are slow.
-
Checked all the paths for cache upload and am sure that a 64 KB buffer is used.
-
@robin I have taken some time now to debug the OneDev server locally because some of our builds now actually start to time out after 1 hour because uploading the cache is so slow.

You are using the Apache Commons method `IOUtils.copy(InputStream, OutputStream, int)` in various places and, as you said, in all those places you call it with a buffer size of 64 KB. That buffer is used to read data from the `InputStream` and then to write that data to the `OutputStream`. However, it is not guaranteed that the 64 KB buffer actually fills up, because `InputStream.read(buffer)` returns the number of bytes that were read, and only that many bytes are then written to the `OutputStream`.

I set a breakpoint in `IOUtils.copyLarge(InputStream, OutputStream, byte[])` and started a build on an agent (within Docker) that also uploads its cache to the server. It turns out that the `InputStream` only delivers 4 KB of data for each `read(buffer)` call. The `InputStream` is provided by Jersey as an `EntityInputStream`, which at some point uses a `HttpInputOverHTTP` stream of Jetty. This Jetty stream uses an 8 KB `ByteBuffer` whose limit is set to 4 KB for some reason, so you can only read 4 KB blocks from HTTP API calls received by Jetty.

In the case of cache uploading in `DefaultJobCacheManger.uploadCache(Long, Long, List<String>, InputStream)`, you create a `FilterOutputStream(FileOutputStream)` chain that is passed on to `IOUtils.copy()`. Because the `FileOutputStream` is not wrapped in a `BufferedOutputStream`, the cache is written to disk in 4 KB blocks, which greatly hurts NFS performance since each write equals a network request to the NFS server.

Because `IOUtils.copy()` is used in various places, you either have to configure Jetty somehow to allow reading more data at once from its HTTP connection, or you must wrap your `OutputStream` in a `BufferedOutputStream` to make sure that you actually write out 64 KB blocks when reading data from your HTTP API and transferring it to an `OutputStream`.
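Roughly what I have in mind (the class and method names here are made up for illustration; only `IOUtils.copy()` and the 64 KB buffer size correspond to the actual call sites):

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.commons.io.IOUtils;

public final class BufferedCacheWrite {

    private static final int BUFFER_SIZE = 64 * 1024;

    // Even if the HTTP InputStream only hands out 4 KB per read(), the
    // BufferedOutputStream accumulates the data and flushes it to the file
    // in 64 KB blocks, so NFS sees far fewer, larger write requests.
    public static void writeCacheFile(InputStream requestBody, File target) throws IOException {
        try (OutputStream out = new BufferedOutputStream(new FileOutputStream(target), BUFFER_SIZE)) {
            IOUtils.copy(requestBody, out, BUFFER_SIZE);
        }
    }
}
```
-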
Thanks for the info. I filed a task to investigate this when I am free. OD-2204
-
Build OD-5768 uses `BufferedOutputStream` with a 64 KB buffer size for all operations writing large files.
-
Thanks. I just tried the new build, and uploading a ~3 GB cache went from roughly 2 MB/s to roughly 10 MB/s, which already improves build time quite a bit. Uploading the cache now takes ~6 minutes (down from ~20 minutes). Downloading the cache takes ~1 minute (even before your change here).
It seems there is still something limiting performance in the chain tar -> gzip -> Jetty OutputStream -> 1 Gbit network -> Jetty InputStream -> BufferedOutputStream -> FileOutputStream.
I will try to investigate further if I find time for it. Maybe tar + gzip of many small files on the agent is generally slow on our hardware/setup (VM, Docker).
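If I get to it, one way to narrow this down would be to wrap each stage in a small counting stream and see where the time goes. A rough sketch of such a helper (illustrative only, not OneDev code):

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Illustrative helper: counts bytes and elapsed time so each stage of the
// chain (gzip, HTTP upload, file write) can be measured separately.
public final class ThroughputOutputStream extends FilterOutputStream {

    private final String label;
    private final long start = System.nanoTime();
    private long bytes;

    public ThroughputOutputStream(String label, OutputStream out) {
        super(out);
        this.label = label;
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
        bytes++;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        out.write(b, off, len);
        bytes += len;
    }

    @Override
    public void close() throws IOException {
        super.close();
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("%s: %.1f MB in %.1f s (%.1f MB/s)%n",
                label, bytes / 1e6, seconds, bytes / 1e6 / seconds);
    }
}
```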
| Type | Improvement |
| Priority | Major |
| Assignee | |
| Labels | No labels |
I run OneDev as a Docker image, and the storage volume for OneDev is mounted as an NFS volume by Docker. Docker itself sets reasonably large NFS `rsize` and `wsize` mount options to increase NFS performance by reducing overhead. OneDev agents use local volumes.

However, when a OneDev agent uploads the build cache (about 1-2 GB) to the OneDev server, it takes very long. Manually testing the NFS share using the `dd` command with different block sizes shows that the block size should not be too small. With a block size of about 64 KB, NFS performance was OK.

I have seen in the OneDev code that OneDev uses similarly sized byte buffers, but maybe there is (unconfigured) JDK or third-party code involved that drops back to an 8 KB buffer size, which is often the default in JDK/library code.

Any chance you can check that by looking through the code and also any JDK/library code involved in build cache tar/untar/send/receive actions? I feel like OneDev effectively only uses 8 KB buffers when working with the build cache. Maybe git checkout and writing files to disk might also be worth checking.

Just to give you an impression: a cache upload of roughly 2 GB took 12 minutes, which equals about 2.8 MB/s. However, the NFS server is very well capable of receiving data at 60-80 MB/s when using `dd`. Given that the build cache is one large tar archive, OneDev should produce nearly the same performance.
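For reference, a rough Java equivalent of the `dd` test I used (the path and sizes are placeholders for our setup; actual NFS request sizes also depend on the `rsize`/`wsize` mount options and the page cache):

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Rough equivalent of "dd bs=<blockSize>": write a 1 GB test file to the
// NFS mount with a fixed block size and print the resulting throughput.
public final class NfsWriteTest {

    public static void main(String[] args) throws IOException {
        int blockSize = 64 * 1024;              // try 4 KB, 8 KB, 64 KB, ...
        long totalBytes = 1024L * 1024 * 1024;  // 1 GB test file
        byte[] block = new byte[blockSize];

        long start = System.nanoTime();
        try (OutputStream out = new FileOutputStream("/mnt/onedev-nfs/blocksize-test.bin")) {
            for (long written = 0; written < totalBytes; written += blockSize) {
                out.write(block); // one write call per block
            }
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("block size %d KB: %.1f MB/s%n",
                blockSize / 1024, totalBytes / 1e6 / seconds);
    }
}
```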