Read all text from a file
Java 11 added the readString() method to read small files as a String, preserving line terminators:
String content = Files.readString(path, StandardCharsets.US_ASCII);
For versions between Java 7 and 11, here's a compact, robust idiom, wrapped up in a utility method:
static String readFile(String path, Charset encoding)
    throws IOException
{
    byte[] encoded = Files.readAllBytes(Paths.get(path));
    return new String(encoded, encoding);
}
Read lines of text from a file
Java 7 added a convenience method to read a file as lines of text, represented as a List<String>. This approach is "lossy" because the line separators are stripped from the end of each line.
List<String> lines = Files.readAllLines(Paths.get(path), encoding);
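To make the "lossy" behavior concrete, here is a minimal sketch. The class name LossyLines and the helper roundTripLines are hypothetical, introduced only for this illustration; the sketch writes a short text with mixed line terminators to a temporary file and reads it back with Files.readAllLines().

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class LossyLines {
    // Hypothetical helper: writes text to a temp file and reads it back as lines.
    static List<String> roundTripLines(String text) throws IOException {
        Path tmp = Files.createTempFile("lines-demo", ".txt");
        try {
            Files.write(tmp, text.getBytes(StandardCharsets.UTF_8));
            return Files.readAllLines(tmp, StandardCharsets.UTF_8);
        } finally {
            // Clean up the temp file whether or not reading succeeded.
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        // The input mixes "\r\n" and "\n"; neither survives in the result.
        List<String> lines = roundTripLines("a\r\nb\n");
        System.out.println(lines); // prints [a, b]
    }
}
```

Because both terminators are stripped, you cannot tell from the resulting list which one the file used. If that matters, read the whole file as a String instead.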
Java 8 added the Files.lines() method to produce a Stream<String>. Again, this method is lossy because line separators are stripped. If an IOException is encountered while reading the file, it is wrapped in an UncheckedIOException, since Stream doesn't accept lambdas that throw checked exceptions.
try (Stream<String> lines = Files.lines(path, encoding)) {
    lines.forEach(System.out::println);
}
This Stream does need a close() call; this is poorly documented in the API, and I suspect many people don't even notice that Stream has a close() method. Be sure to use a try-with-resources (ARM) block as shown.
If you are working with a source other than a file, you can use the lines() method in BufferedReader instead.
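As a rough sketch of that BufferedReader route, the example below reads lines from an arbitrary InputStream. The class ReaderLines and the helper readLines are hypothetical names, and the in-memory ByteArrayInputStream merely stands in for a socket, a classpath resource, or similar.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

class ReaderLines {
    // Hypothetical helper: collects the lines of any InputStream into a list.
    static List<String> readLines(InputStream in, Charset encoding) throws IOException {
        // try-with-resources closes the reader (and the wrapped stream) when done.
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, encoding))) {
            return reader.lines().collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(
                "first\nsecond\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(readLines(in, StandardCharsets.UTF_8)); // prints [first, second]
    }
}
```

Collecting into a list is just for the demonstration; for large inputs you would process the stream element by element instead.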
Memory utilization
The first method, which preserves line breaks, can temporarily require memory several times the size of the file, because for a short time the raw file contents (a byte array) and the decoded characters (each of which is 16 bits even if encoded as 8 bits in the file) reside in memory at once. It is safest to apply to files that you know to be small relative to the available memory.
The second method, reading lines, is usually more memory efficient, because the input byte buffer for decoding doesn't need to contain the entire file. However, it's still not suitable for files that are very large relative to available memory.
For reading large files, you need a different design for your program, one that reads a chunk of text from a stream, processes it, and then moves on to the next, reusing the same fixed-sized memory block. Here, "large" depends on the computer specs. Nowadays, this threshold might be many gigabytes of RAM. The third method, using a Stream<String>, is one way to do this, if your input "records" happen to be individual lines. (Using the readLine() method of BufferedReader is the procedural equivalent to this approach.)
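When the records are not lines, the same idea can be sketched with a fixed-size character buffer. The class ChunkedRead and the helper countChars are hypothetical, and "processing" here is reduced to counting characters; a real program would do its work on each chunk inside the loop. For a file, the Reader could come from Files.newBufferedReader(path, encoding) rather than the StringReader used here.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class ChunkedRead {
    // Hypothetical helper: counts characters while holding only one
    // fixed-size buffer in memory, regardless of the input's total size.
    static long countChars(Reader reader, int bufferSize) throws IOException {
        char[] buffer = new char[bufferSize];
        long total = 0;
        int n;
        // read() returns the number of chars placed in the buffer, or -1 at end of input.
        while ((n = reader.read(buffer, 0, buffer.length)) != -1) {
            // Process buffer[0..n) here; this sketch only counts the characters.
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        try (Reader reader = new StringReader("hello, chunked world")) {
            System.out.println(countChars(reader, 8)); // prints 20
        }
    }
}
```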
Character encoding
One thing that is missing from the sample in the original post is the character encoding. There are some special cases where the platform default is what you want, but they are rare, and you should be able to justify your choice.
The StandardCharsets class defines some constants for the encodings required of all Java runtimes:
String content = readFile("test.txt", StandardCharsets.UTF_8);
The platform default is available from the Charset class itself:
String content = readFile("test.txt", Charset.defaultCharset());
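To illustrate why the choice of encoding matters, here is a small sketch of what happens when bytes written as UTF-8 are decoded with a different charset. The class CharsetMismatch and the helper decodeWrongly are hypothetical names used only for this demonstration.

```java
import java.nio.charset.StandardCharsets;

class CharsetMismatch {
    // Hypothetical helper: encodes text as UTF-8, then decodes it as ISO-8859-1.
    static String decodeWrongly(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "café", written with an escape to avoid source-encoding issues
        System.out.println(original.length());                // prints 4
        // The two UTF-8 bytes of 'é' are decoded as two separate characters.
        System.out.println(decodeWrongly(original).length()); // prints 5
    }
}
```

No exception is thrown in this case; the text is silently corrupted, which is why it is worth stating the encoding explicitly.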
Note: This answer largely replaces my Java 6 version. The utility of Java 7 safely simplifies the code, and the old answer, which used a mapped byte buffer, prevented the file that was read from being deleted until the mapped buffer was garbage collected. You can view the old version via the "edited" link on this answer.
There are different ways to delete an array element, where some are more useful for some specific tasks than others.
Deleting a single array element
If you want to delete just one array element, you can use unset() or alternatively \array_splice().
If you know the value but don’t know the key of the element to delete, you can use \array_search() to get the key. This only works if the element does not occur more than once, since \array_search() returns the first hit only.
Note that when you use unset(), the array keys won’t change. If you want to reindex the keys, you can use \array_values() after unset(), which will convert all keys to numerically enumerated keys starting from 0.
Code:
$array = [0 => "a", 1 => "b", 2 => "c"];
unset($array[1]);
//           ↑ Key which you want to delete
Output:
[
    [0] => a
    [2] => c
]
If you use \array_splice(), the keys will automatically be reindexed, but the associative keys won’t change, as opposed to \array_values(), which will convert all keys to numerical keys.
\array_splice() needs the offset, not the key, as the second parameter.
Code:
$array = [0 => "a", 1 => "b", 2 => "c"];
\array_splice($array, 1, 1);
//                    ↑ Offset which you want to delete
Output:
[
    [0] => a
    [1] => c
]
\array_splice(), same as unset(), takes the array by reference. You don’t assign the return values of those functions back to the array.
Deleting multiple array elements
If you want to delete multiple array elements and don’t want to call unset() or \array_splice() multiple times, you can use the functions \array_diff() or \array_diff_key(), depending on whether you know the values or the keys of the elements which you want to delete.
If you know the values of the array elements which you want to delete, then you can use \array_diff(). As before with unset(), it won’t change the keys of the array.
Code:
$array = [0 => "a", 1 => "b", 2 => "c", 3 => "c"];
$array = \array_diff($array, ["a", "c"]);
//                           └────────┘
//                           Array values which you want to delete
Output:
[
    [1] => b
]
If you know the keys of the elements which you want to delete, then you want to use \array_diff_key(). You have to make sure you pass the keys as keys in the second parameter and not as values. Keys won’t reindex.
Code:
$array = [0 => "a", 1 => "b", 2 => "c"];
$array = \array_diff_key($array, [0 => "xy", "2" => "xy"]);
//                                ↑           ↑
//                                Array keys which you want to delete
Output:
[
    [1] => b
]
If you want to use unset() or \array_splice() to delete multiple elements with the same value, you can use \array_keys() to get all the keys for a specific value and then delete all elements.
Best Answer
Matches one or more characters not a-z 0-9 [case-insensitive], or "." and replaces with ""