Some people would say that two threads is too many - I'm not quite in that camp :-)
Here's my advice: measure, don't guess. One suggestion is to make it configurable and initially set it to 100, then release your software to the wild and monitor what happens.
If your thread usage peaks at 3, then 100 is too much. If it remains at 100 for most of the day, bump it up to 200 and see what happens.
You could actually have your code itself monitor usage and adjust the configuration for the next time it starts but that's probably overkill.
For clarification and elaboration:
I'm not advocating rolling your own thread pooling subsystem, by all means use the one you have. But, since you were asking about a good cut-off point for threads, I assume your thread pool implementation has the ability to limit the maximum number of threads created (which is a good thing).
I've written thread and database connection pooling code and they have the following features (which I believe are essential for performance):
- a minimum number of active threads.
- a maximum number of threads.
- shutting down threads that haven't been used for a while.
The first sets a baseline for minimum performance in terms of the thread pool client (this number of threads is always available for use). The second sets a restriction on resource usage by active threads. The third returns you to the baseline in quiet times so as to minimise resource use.
You need to balance the resource usage of having unused threads (A) against the resource usage of not having enough threads to do the work (B).
(A) is generally memory usage (stacks and so on) since a thread doing no work will not be using much of the CPU. (B) will generally be a delay in the processing of requests as they arrive as you need to wait for a thread to become available.
That's why you measure. As you state, the vast majority of your threads will be waiting for a response from the database so they won't be running. There are two factors that affect how many threads you should allow for.
The first is the number of DB connections available. This may be a hard limit unless you can increase it at the DBMS - I'm going to assume your DBMS can take an unlimited number of connections in this case (although you should ideally be measuring that as well).
Then, the number of threads you should have depend on your historical use. The minimum you should have running is the minimum number that you've ever had running + A%, with an absolute minimum of (for example, and make it configurable just like A) 5.
The maximum number of threads should be your historical maximum + B%.
You should also be monitoring for behaviour changes. If, for some reason, your usage goes to 100% of available for a significant time (so that it would affect the performance of clients), you should bump up the maximum allowed until it's once again B% higher.
In response to the "what exactly should I measure?" question:
What you should measure specifically is the maximum amount of threads in concurrent use (e.g., waiting on a return from the DB call) under load. Then add a safety factor of 10% for example (emphasised, since other posters seem to take my examples as fixed recommendations).
In addition, this should be done in the production environment for tuning. It's okay to get an estimate beforehand but you never know what production will throw your way (which is why all these things should be configurable at runtime). This is to catch a situation such as unexpected doubling of the client calls coming in.
Introduction
To browse and select a file for upload you need a HTML <input type="file">
field in the form. As stated in the HTML specification you have to use the POST
method and the enctype
attribute of the form has to be set to "multipart/form-data"
.
<form action="upload" method="post" enctype="multipart/form-data">
<input type="text" name="description" />
<input type="file" name="file" />
<input type="submit" />
</form>
After submitting such a form, the binary multipart form data is available in the request body in a different format than when the enctype
isn't set.
Before Servlet 3.0, the Servlet API didn't natively support multipart/form-data
. It supports only the default form enctype of application/x-www-form-urlencoded
. The request.getParameter()
and consorts would all return null
when using multipart form data. This is where the well known Apache Commons FileUpload came into the picture.
Don't manually parse it!
You can in theory parse the request body yourself based on ServletRequest#getInputStream()
. However, this is a precise and tedious work which requires precise knowledge of RFC2388. You shouldn't try to do this on your own or copypaste some homegrown library-less code found elsewhere on the Internet. Many online sources have failed hard in this, such as roseindia.net. See also uploading of pdf file. You should rather use a real library which is used (and implicitly tested!) by millions of users for years. Such a library has proven its robustness.
When you're already on Servlet 3.0 or newer, use native API
If you're using at least Servlet 3.0 (Tomcat 7, Jetty 9, JBoss AS 6, GlassFish 3, etc), then you can just use standard API provided HttpServletRequest#getPart()
to collect the individual multipart form data items (most Servlet 3.0 implementations actually use Apache Commons FileUpload under the covers for this!). Also, normal form fields are available by getParameter()
the usual way.
First annotate your servlet with @MultipartConfig
in order to let it recognize and support multipart/form-data
requests and thus get getPart()
to work:
@WebServlet("/upload")
@MultipartConfig
public class UploadServlet extends HttpServlet {
// ...
}
Then, implement its doPost()
as follows:
protected void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
String description = request.getParameter("description"); // Retrieves <input type="text" name="description">
Part filePart = request.getPart("file"); // Retrieves <input type="file" name="file">
String fileName = Paths.get(filePart.getSubmittedFileName()).getFileName().toString(); // MSIE fix.
InputStream fileContent = filePart.getInputStream();
// ... (do your job here)
}
Note the Path#getFileName()
. This is a MSIE fix as to obtaining the file name. This browser incorrectly sends the full file path along the name instead of only the file name.
In case you want to upload multiple files via either multiple="true"
,
<input type="file" name="files" multiple="true" />
or the old-fashioned way with multiple inputs,
<input type="file" name="files" />
<input type="file" name="files" />
<input type="file" name="files" />
...
then you can collect them as below (unfortunately there is no such method as request.getParts("files")
):
protected void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
// ...
List<Part> fileParts = request.getParts().stream().filter(part -> "files".equals(part.getName()) && part.getSize() > 0).collect(Collectors.toList()); // Retrieves <input type="file" name="files" multiple="true">
for (Part filePart : fileParts) {
String fileName = Paths.get(filePart.getSubmittedFileName()).getFileName().toString(); // MSIE fix.
InputStream fileContent = filePart.getInputStream();
// ... (do your job here)
}
}
When you're not on Servlet 3.1 yet, manually get submitted file name
Note that Part#getSubmittedFileName()
was introduced in Servlet 3.1 (Tomcat 8, Jetty 9, WildFly 8, GlassFish 4, etc). If you're not on Servlet 3.1 yet, then you need an additional utility method to obtain the submitted file name.
private static String getSubmittedFileName(Part part) {
for (String cd : part.getHeader("content-disposition").split(";")) {
if (cd.trim().startsWith("filename")) {
String fileName = cd.substring(cd.indexOf('=') + 1).trim().replace("\"", "");
return fileName.substring(fileName.lastIndexOf('/') + 1).substring(fileName.lastIndexOf('\\') + 1); // MSIE fix.
}
}
return null;
}
String fileName = getSubmittedFileName(filePart);
Note the MSIE fix as to obtaining the file name. This browser incorrectly sends the full file path along the name instead of only the file name.
When you're not on Servlet 3.0 yet, use Apache Commons FileUpload
If you're not on Servlet 3.0 yet (isn't it about time to upgrade?), the common practice is to make use of Apache Commons FileUpload to parse the multpart form data requests. It has an excellent User Guide and FAQ (carefully go through both). There's also the O'Reilly ("cos") MultipartRequest
, but it has some (minor) bugs and isn't actively maintained anymore for years. I wouldn't recommend using it. Apache Commons FileUpload is still actively maintained and currently very mature.
In order to use Apache Commons FileUpload, you need to have at least the following files in your webapp's /WEB-INF/lib
:
Your initial attempt failed most likely because you forgot the commons IO.
Here's a kickoff example how the doPost()
of your UploadServlet
may look like when using Apache Commons FileUpload:
protected void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
try {
List<FileItem> items = new ServletFileUpload(new DiskFileItemFactory()).parseRequest(request);
for (FileItem item : items) {
if (item.isFormField()) {
// Process regular form field (input type="text|radio|checkbox|etc", select, etc).
String fieldName = item.getFieldName();
String fieldValue = item.getString();
// ... (do your job here)
} else {
// Process form file field (input type="file").
String fieldName = item.getFieldName();
String fileName = FilenameUtils.getName(item.getName());
InputStream fileContent = item.getInputStream();
// ... (do your job here)
}
}
} catch (FileUploadException e) {
throw new ServletException("Cannot parse multipart request.", e);
}
// ...
}
It's very important that you don't call getParameter()
, getParameterMap()
, getParameterValues()
, getInputStream()
, getReader()
, etc on the same request beforehand. Otherwise the servlet container will read and parse the request body and thus Apache Commons FileUpload will get an empty request body. See also a.o. ServletFileUpload#parseRequest(request) returns an empty list.
Note the FilenameUtils#getName()
. This is a MSIE fix as to obtaining the file name. This browser incorrectly sends the full file path along the name instead of only the file name.
Alternatively you can also wrap this all in a Filter
which parses it all automagically and put the stuff back in the parametermap of the request so that you can continue using request.getParameter()
the usual way and retrieve the uploaded file by request.getAttribute()
. You can find an example in this blog article.
Workaround for GlassFish3 bug of getParameter()
still returning null
Note that Glassfish versions older than 3.1.2 had a bug wherein the getParameter()
still returns null
. If you are targeting such a container and can't upgrade it, then you need to extract the value from getPart()
with help of this utility method:
private static String getValue(Part part) throws IOException {
BufferedReader reader = new BufferedReader(new InputStreamReader(part.getInputStream(), "UTF-8"));
StringBuilder value = new StringBuilder();
char[] buffer = new char[1024];
for (int length = 0; (length = reader.read(buffer)) > 0;) {
value.append(buffer, 0, length);
}
return value.toString();
}
String description = getValue(request.getPart("description")); // Retrieves <input type="text" name="description">
Saving uploaded file (don't use getRealPath()
nor part.write()
!)
Head to the following answers for detail on properly saving the obtained InputStream
(the fileContent
variable as shown in the above code snippets) to disk or database:
Serving uploaded file
Head to the following answers for detail on properly serving the saved file from disk or database back to the client:
Ajaxifying the form
Head to the following answers how to upload using Ajax (and jQuery). Do note that the servlet code to collect the form data does not need to be changed for this! Only the way how you respond may be changed, but this is rather trivial (i.e. instead of forwarding to JSP, just print some JSON or XML or even plain text depending on whatever the script responsible for the Ajax call is expecting).
Hope this all helps :)
Best Answer
Per request means when an HTTP request is made, a thread is created or retrieved from a pool to serve it. One thread serves the whole request. Thread per connection would be the same thing except the thread is used for an entire connection, which could be multiple requests and could also have a lot of dead time in between requests. Servlet containers are thread per request. There may be some implementations that offer thread per connection, but I don't know, and it seems like it would be very wasteful.
Creating a thread inside another thread doesn't establish any special relationship, and the whole point of doing so in most cases is to let the one thread do more work or terminate while the other thread continues working. In your scenario, using a different thread to do work required by a request will, as you expect, allow the response to be sent immediately. The thread used to serve that request will also be immediately available for another request, regardless of how long your other thread takes to complete. This is pretty much the way of doing asynchronous work in a thread-per-request servlet container.
Caveat: If you're in a full Java EE container, threads may be managed for you in a way that makes it a bad idea to spawn your own. In that case, you're better off asking the container for a thread, but the general principles are the same.