Working with Text
The java.io package provides classes that allow you to convert between Unicode character streams and byte streams of non-Unicode text. With the InputStreamReader class, you can convert byte streams to character streams. You use the OutputStreamWriter class to translate character streams into byte streams. The following figure illustrates the conversion process:

When you create
InputStreamReader and OutputStreamWriter objects, you specify the byte encoding that you want to convert. For example, to translate a text file in the UTF-8 encoding into Unicode, you create an InputStreamReader as follows:

    FileInputStream fis = new FileInputStream("test.txt");
    InputStreamReader isr = new InputStreamReader(fis, "UTF8");

If you omit the encoding identifier,
InputStreamReader and OutputStreamWriter rely on the default encoding. You can determine which encoding an InputStreamReader or OutputStreamWriter uses by invoking the getEncoding method, as follows:

    InputStreamReader defaultReader = new InputStreamReader(fis);
    String defaultEncoding = defaultReader.getEncoding();

The example that follows shows you how to perform character-set conversions with the
InputStreamReader and OutputStreamWriter classes. The full source code for this example is in StreamConverter.java. This program displays Japanese characters. Before trying it out, verify that the appropriate fonts have been installed on your system. If you are using the JDK software that is compatible with version 1.1, make a copy of the font.properties file and then replace it with the font.properties.ja file.

The
StreamConverter program converts a sequence of Unicode characters from a String object into a FileOutputStream of bytes encoded in UTF-8. The method that performs the conversion is called writeOutput:

    static void writeOutput(String str) {
        try {
            FileOutputStream fos = new FileOutputStream("test.txt");
            // The writer encodes each character as UTF-8 bytes.
            Writer out = new OutputStreamWriter(fos, "UTF8");
            out.write(str);
            // Closing the writer flushes it and closes the underlying stream.
            out.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

The
readInput method reads the bytes encoded in UTF-8 from the file created by the writeOutput method. An InputStreamReader object converts the bytes from UTF-8 into Unicode, and the method returns the result in a String. The readInput method is as follows:

    static String readInput() {
        StringBuffer buffer = new StringBuffer();
        try {
            FileInputStream fis = new FileInputStream("test.txt");
            // The reader decodes the UTF-8 bytes into characters.
            InputStreamReader isr = new InputStreamReader(fis, "UTF8");
            Reader in = new BufferedReader(isr);
            int ch;
            // read() returns -1 at the end of the stream.
            while ((ch = in.read()) > -1) {
                buffer.append((char)ch);
            }
            in.close();
            return buffer.toString();
        } catch (IOException e) {
            e.printStackTrace();
            return null;
        }
    }

The
main method of the StreamConverter program invokes the writeOutput method to create a file of bytes encoded in UTF-8. The readInput method reads the same file, converting the bytes back into Unicode. Here is the source code for the main method:

    public static void main(String[] args) {
        // Six Japanese characters, written as Unicode escapes.
        String jaString = new String("\u65e5\u672c\u8a9e\u6587\u5b57\u5217");
        writeOutput(jaString);
        String inputString = readInput();
        // Show the original and the round-tripped string side by side.
        String displayString = jaString + " " + inputString;
        new ShowString(displayString, "Conversion Demo");
    }

The original string (
jaString) should be identical to the newly created string (inputString). To show that the two strings are the same, the program concatenates them and displays them with a ShowString object. The ShowString class displays a string with the Graphics.drawString method. The source code for this class is in ShowString.java. When the StreamConverter program instantiates ShowString, the following window appears. The repetition of the characters displayed verifies that the two strings are identical:
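
The identity of the two strings can also be checked in code instead of by inspecting a window. The following is a minimal sketch of that idea, not the StreamConverter source: the class name RoundTripCheck and the helpers writeUtf8 and readUtf8 are hypothetical, it writes to a temporary file rather than test.txt, and it assumes Java 7 or later so that it can use StandardCharsets.UTF_8 and try-with-resources in place of the "UTF8" string identifier.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration class; not part of the original example.
public class RoundTripCheck {

    // Encodes the characters of 'text' as UTF-8 bytes and writes them to 'file'.
    static void writeUtf8(Path file, String text) throws IOException {
        try (Writer out = new OutputStreamWriter(
                new FileOutputStream(file.toFile()), StandardCharsets.UTF_8)) {
            out.write(text);
        }
    }

    // Reads the UTF-8 bytes in 'file' and decodes them back into a String.
    static String readUtf8(Path file) throws IOException {
        StringBuilder buffer = new StringBuilder();
        try (Reader in = new BufferedReader(new InputStreamReader(
                new FileInputStream(file.toFile()), StandardCharsets.UTF_8))) {
            int ch;
            while ((ch = in.read()) != -1) {   // -1 signals end of stream
                buffer.append((char) ch);
            }
        }
        return buffer.toString();
    }

    public static void main(String[] args) throws IOException {
        String jaString = "\u65e5\u672c\u8a9e\u6587\u5b57\u5217";
        Path file = Files.createTempFile("roundtrip", ".txt");
        try {
            writeUtf8(file, jaString);
            String inputString = readUtf8(file);
            // Each of the six characters takes three bytes in UTF-8.
            System.out.println("bytes on disk: " + Files.size(file));      // 18
            System.out.println("round trip identical: "
                    + jaString.equals(inputString));                       // true
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Passing a Charset constant avoids the UnsupportedEncodingException that the string-based constructors can throw for a misspelled encoding name, and try-with-resources closes the streams even if a read or write fails.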