Documenting legacy code-bases
I would highly recommend following the Boy Scout rule (leave whatever you touch a little better than you found it) with legacy code-bases.
Trying to document a legacy project independently of working on it will just never happen. Even if you get in contractors to do it, as soon as they finish the project, that documentation will start falling behind all over again, because developers haven't got into the habit of updating it.
In-code documentation
The most important thing is to use the documentation facilities in your chosen development environment: docstrings (readable through pydoc) for Python, Javadoc for Java, or XML comments for C#. These make it easy to write the documentation at the same time as writing the code.
If you rely on coming back and documenting things later, you may not get around to it, but if you do it as you are writing the code, what needs to be documented will be fresh in your mind. C# even has the option to issue a compilation warning if the XML documentation is incomplete or inconsistent with the actual code.
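As a minimal sketch of what this looks like in Python: the function name parse_version and its behaviour are made up for illustration, and only the docstring mechanism itself is the point.

```python
import inspect


def parse_version(text):
    """Split a dotted version string into a tuple of integers.

    Args:
        text: A version such as "2.10.3".

    Returns:
        A tuple of ints, for example (2, 10, 3).

    Raises:
        ValueError: If any component is not a whole number.
    """
    return tuple(int(part) for part in text.split("."))


# The docstring travels with the function object, so help(), pydoc and
# most IDE tooltips all show the same text.
summary = inspect.getdoc(parse_version).splitlines()[0]
print(summary)
```

Because the description sits directly above the code it describes, a reviewer sees both in the same diff when either one changes.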
Also, if reviewing documentation becomes part of your code review process, everyone can be encouraged to contribute, fostering a sense of ownership of the documentation as well as of the code.
Tests as documentation
Another important aspect is having good integration and unit tests.
Often documentation concentrates on what classes and methods do in isolation, skipping over how they are used together to solve your problem. Tests often put these into context by showing how they interact with each other.
Similarly, unit tests often make external dependencies explicit, because they show which things need to be mocked out.
I also find that with test-driven development I write software which is easier to use, because I'm using it right from the word go. With a good testing framework, making code easier to test and making it easier to use are often the same thing.
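Here is a small sketch of a test doing that job, using unittest and unittest.mock from the standard library. InvoiceService and its mailer are hypothetical names invented for this example.

```python
import unittest
from unittest import mock


class InvoiceService:
    """Hypothetical service that depends on an external mailer."""

    def __init__(self, mailer):
        self.mailer = mailer

    def send_reminder(self, customer_email, amount_due):
        if amount_due <= 0:
            return False
        self.mailer.send(customer_email, f"You owe {amount_due:.2f}")
        return True


class InvoiceServiceTest(unittest.TestCase):
    def test_reminder_is_mailed_when_money_is_owed(self):
        mailer = mock.Mock()  # the external dependency, made explicit
        service = InvoiceService(mailer)

        self.assertTrue(service.send_reminder("a@example.com", 12.5))
        mailer.send.assert_called_once_with("a@example.com", "You owe 12.50")

    def test_no_mail_when_nothing_is_owed(self):
        mailer = mock.Mock()
        service = InvoiceService(mailer)

        self.assertFalse(service.send_reminder("a@example.com", 0))
        mailer.send.assert_not_called()
```

Someone reading only the tests learns that the service needs a mailer, how the two are wired together, and what happens in the edge case, none of which a per-method comment would necessarily tell them. The tests can be run with python -m unittest.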
Higher level documentation
Finally there is what to do about system level and architectural documentation.
Many would advocate writing such documentation in a wiki or using Word or other word processor, but for me the best place for such documentation is also alongside the code, in a plain text format that is version control system friendly.
Just like with in-code documentation, if you store your higher level documentation in your code repository then you are more likely to keep it up to date. You also get the benefit that when you pull out version X.Y of the code, you also get version X.Y of the documentation. In addition, if you use a VCS friendly format, then it means that it is easy to branch, diff and merge, just like your code.
Not only that, but if you use something like Read the Docs then you can publish version-specific documentation for each software release.
I quite like reStructuredText (RST), as it is easy to produce both HTML pages and PDF documents from it, and it is much friendlier than LaTeX, yet can still include LaTeX math expressions when you need them.
Self-documenting code (and in-code comments) and Javadoc comments have two very different target audiences.
The code and comments that remain in the code file are for developers. You want to address their concerns here - make it easy to understand what the code does and why it is the way it is. The use of appropriate variable names, methods, classes, and so on (self-documenting code) coupled with comments achieves this.
Javadoc comments are typically for users of the API. These are also developers, but they don't care about the system's internal structure, just the classes, methods, inputs, and outputs of the system. The code is contained within a black box. These comments should be used to explain how to do certain tasks, what the expected results of operations are, when exceptions are thrown, and what input values mean. Given a Javadoc-generated set of documentation, I should fully understand how to use your interfaces without ever looking at a line of your code.
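The same split can be sketched in Python, with a docstring standing in for the Javadoc comment (an assumption on my part; the mechanism differs but the two audiences are the same). The median function is just an illustrative example.

```python
def median(values):
    """Return the median of a non-empty sequence of numbers.

    For an even number of items the mean of the two middle items is
    returned. The input is not modified.

    Raises:
        ValueError: If values is empty.
    """
    if not values:
        raise ValueError("median() requires at least one value")

    # Maintainer note: sorted() returns a copy, so the caller's list is
    # untouched. A selection algorithm would avoid the full sort, but the
    # inputs assumed here are small, so clarity wins.
    ordered = sorted(values)
    middle = len(ordered) // 2

    if len(ordered) % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2
```

The docstring tells a caller everything needed to use the function as a black box; the inline comment explains a design decision that only matters to whoever edits the body.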
Best Answer
Literate programming is the nice idea that you can write your code together with an explanation or walkthrough of that code. Importantly, you are not constrained by the syntax of the underlying programming language but can structure your literate program in any way you want. (Literate programming involves chunks of code embedded into text, not comments into code.)
There are three huge problems with literate programming: it takes a lot of effort, there is little tooling, and changes become more difficult.
Documentation always requires effort. Literate programming requires less effort than maintaining separate documentation of comparable quality. However, this amount of effort is still unwarranted for most kinds of code. A lot of code is not interesting and requires little discussion; it's mostly just delegating stuff to some framework. The kind of tricky logic that benefits from literate programming is comparatively rare.
While there are various tools for literate programming (including Knuth's original WEB, and decent support in the Haskell ecosystem), they all suck. The next-best thing I've come across is org-mode, but that requires the use of Emacs. The problem is that programming is more than typing letters; it's also debugging and navigating code, which benefits greatly from an IDE-style experience. Auto-complete is non-negotiable! Literate programming tools also tend to require non-standard build processes, or mess up line numbers in error messages – not acceptable. If a tool makes your code easier to understand but harder to debug, that's not necessarily a good choice.
Related to this is the issue that changes to literately programmed software become more difficult. When you refactor code, you also have to restructure the document. But while you have a compiler or linter to ensure that your code continues to make sense, there's no guarantee that you haven't disrupted the structure of the document. Literate programming is writing and programming in equal parts.
So while full-blown literate programming does not seem to have a place in modern software development, it is still possible to reap some of the benefits. Consider in particular that literate programming is now over 35 years old, so a lot has happened in the meantime.
Extracting a function with a useful name has many of the same benefits as a chunk of code in literate programming. It's arguably even better because variable names get their own separate scope. Most programming languages allow functions to be defined in an arbitrary order, which also allows you to structure the source code within a file in a sensible manner.
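A rough before-and-after sketch in Python; all function names and the pricing logic are invented for illustration.

```python
def total_price_inline(items, tax_rate):
    """Before: one body with comment-labelled chunks."""
    # sum up the line items
    subtotal = 0
    for price, quantity in items:
        subtotal += price * quantity
    # add tax
    return subtotal + subtotal * tax_rate


def total_price(items, tax_rate):
    """After: the chunk labels have become function names."""
    return add_tax(subtotal_of(items), tax_rate)


# The helpers are defined after their caller, so the file reads top-down
# like a narrative. Python resolves the names when total_price is called.
def subtotal_of(items):
    return sum(price * quantity for price, quantity in items)


def add_tax(amount, tax_rate):
    return amount + amount * tax_rate
```

Both versions compute the same result, but in the second one the loop variables live inside subtotal_of and cannot leak into the tax calculation.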
Literate programming can be used to describe the “why” of code in a human-readable manner. A somewhat related idea is to express requirements for your program in a format that is both human- and machine-readable, e.g. as suggested by BDD. This forms a kind of executable specification.
Some markup languages have the ability to pull code snippets from your source code. This lets the code be code and lets you construct a narrative around these snippets, without having to duplicate, copy, or update the code. Unfortunately, the popular Markdown has no built-in mechanism for that (but RST, AsciiDoc, and LaTeX+listings do). This is possibly the best current alternative for creating literate programming-style documents.
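A lightweight relative of this idea in Python is the standard library's doctest module. It is not a BDD framework, so treat the following only as a sketch of the "readable text that is also checked" principle; the shipping rule and its numbers are hypothetical.

```python
import doctest


def shipping_cost(order_total):
    """Work out the shipping charge for an order.

    Orders of 50 or more ship for free:

    >>> shipping_cost(50)
    0
    >>> shipping_cost(120)
    0

    Smaller orders pay a flat fee of 5:

    >>> shipping_cost(49)
    5
    """
    return 0 if order_total >= 50 else 5


def check_specification():
    """Run the docstring examples and return (failures, examples tried)."""
    runner = doctest.DocTestRunner(verbose=False)
    finder = doctest.DocTestFinder()
    for test in finder.find(shipping_cost, globs={"shipping_cost": shipping_cost}):
        runner.run(test)
    return runner.failures, runner.tries
```

If someone changes the threshold in the code without updating the prose (or the other way round), check_specification() reports a failure, which is exactly the drift that ordinary documentation cannot detect.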