How do I convert a string of Russian Cyrillic letters?

I am parsing mp3 tags.

String artist - I do not know what its encoding is.

Ïåñíÿ ïðî íàäåæäó - an example string; in Russian it should read "Песня про надежду".

I am using http://code.google.com/p/juniversalchardet/

Code:

    String GetEncoding(String text) throws IOException {
        byte[] buf = new byte[4096];
        InputStream fis = new ByteArrayInputStream(text.getBytes());
        UniversalDetector detector = new UniversalDetector(null);
        int nread;
        while ((nread = fis.read(buf)) > 0 && !detector.isDone()) {
            detector.handleData(buf, 0, nread);
        }
        detector.dataEnd();
        String encoding = detector.getDetectedCharset();
        detector.reset();
        return encoding;
    }

And the conversion:

new String(text.getBytes(encoding), "cp1251"); - but this does not work.

If I use UTF-16:

new String(text.getBytes("UTF-16"), "cp1251") returns "юяПеснаддодадддуд", and the "spaces" in the result are not the space character.

Edit:

This is how the bytes are first read:

    byte[] abyFrameData = new byte[iTagSize];
    oID3DIS.readFully(abyFrameData);
    ByteArrayInputStream oFrameBAIS = new ByteArrayInputStream(abyFrameData);

String s = new String(abyFrameData, "????");

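Java strings are UTF-16. Text in any other encoding has to be handled as a byte sequence. To decode character data, you must supply the encoding when the string is first created. If you already have a corrupted string, it is too late.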
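To make the point concrete, here is a minimal sketch. It assumes the tag bytes are really windows-1251 (cp1251), which the question suggests but does not confirm; the class and method names are made up for illustration. The Russian title is written with Unicode escapes so that the encoding of the source file itself does not matter.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecodeOnceDemo {
    // The Russian title from the question, spelled with Unicode escapes.
    static final String TITLE =
        "\u041F\u0435\u0441\u043D\u044F \u043F\u0440\u043E \u043D\u0430\u0434\u0435\u0436\u0434\u0443";

    // Assumption: the tag was written as windows-1251.
    static final Charset CP1251 = Charset.forName("windows-1251");

    /** Decodes raw bytes with an explicitly chosen charset. */
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    /**
     * Undoes a wrong ISO-8859-1 decode. This only works because ISO-8859-1
     * maps every byte value to exactly one char, so no information was lost.
     */
    static String repairLatin1(String garbled, Charset actual) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), actual);
    }

    public static void main(String[] args) {
        byte[] raw = TITLE.getBytes(CP1251);

        String right = decode(raw, CP1251);                       // correct charset
        String wrong = decode(raw, StandardCharsets.ISO_8859_1);  // mojibake, but lossless
        String lossy = decode(raw, StandardCharsets.US_ASCII);    // bytes >= 0x80 are replaced

        System.out.println(right.equals(TITLE));
        System.out.println(repairLatin1(wrong, CP1251).equals(TITLE));
        System.out.println(lossy.contains("\uFFFD"));
    }
}
```

Decoding as ISO-8859-1 yields exactly the garbled text from the question, and that particular mistake can still be reversed. Decoding with a charset that cannot represent every byte (US-ASCII here) substitutes replacement characters, and at that point the original bytes are gone.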

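Assuming ID3, the specification defines the encoding rules. For example, ID3v2.4.0 can restrict which encodings are used via the extended header:

q - Text encoding restrictions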

    0    No restrictions
    1    Strings are only encoded with ISO-8859-1 [ISO-8859-1] or UTF-8 [UTF-8].

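Encoding handling is defined further down in the document:

If nothing else is said, strings, including numeric strings and URLs, are represented as ISO-8859-1 characters in the range $20 - $FF. Such strings are represented in frame descriptions as <text string>, or <full text string> if newlines are allowed. If nothing else is said newline character is forbidden. In ISO-8859-1 a newline is represented, when allowed, with $0A only.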

Frames that allow different types of text encoding contains a text encoding description byte. Possible encodings:

    $00   ISO-8859-1 [ISO-8859-1]. Terminated with $00.
    $01   UTF-16 [UTF-16] encoded Unicode [UNICODE] with BOM. All strings in the same frame SHALL have the same byteorder. Terminated with $00 00.
    $02   UTF-16BE [UTF-16] encoded Unicode [UNICODE] without BOM. Terminated with $00 00.
    $03   UTF-8 [UTF-8] encoded Unicode [UNICODE]. Terminated with $00.
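The table above maps directly onto Java's built-in charsets. A small sketch of that mapping follows; the class and method names are hypothetical and are not taken from any ID3 library.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Id3TextEncoding {
    /** Returns the charset selected by an ID3v2.4.0 text encoding byte. */
    static Charset charsetFor(int encodingByte) {
        switch (encodingByte) {
            case 0x00: return StandardCharsets.ISO_8859_1;
            case 0x01: return StandardCharsets.UTF_16;   // byte order comes from the BOM
            case 0x02: return StandardCharsets.UTF_16BE; // no BOM expected
            case 0x03: return StandardCharsets.UTF_8;
            default:
                throw new IllegalArgumentException("Unknown text encoding byte: " + encodingByte);
        }
    }

    /** Terminator length in bytes: $00 00 for the UTF-16 variants, $00 otherwise. */
    static int terminatorLength(int encodingByte) {
        return (encodingByte == 0x01 || encodingByte == 0x02) ? 2 : 1;
    }

    public static void main(String[] args) {
        // Little-endian BOM (FF FE) followed by U+041F, as a $01 frame might store it.
        byte[] data = { (byte) 0xFF, (byte) 0xFE, 0x1F, 0x04 };
        String text = new String(data, charsetFor(0x01));
        System.out.println(text.equals("\u041F"));
    }
}
```

Note that none of the four encodings is cp1251. A tag that holds cp1251 bytes behind a $00 encoding byte does not follow the specification, which is why a detection step may still be needed for such files.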

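Use a transcoding class such as InputStreamReader or (more likely in this case) the String(byte[],Charset) constructor to decode the data. See also Java: a rough guide to character encoding.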
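Both routes give the same result for a byte array that is already in memory. The sketch below shows them side by side; the class and method names are illustrative only, and the sample bytes assume cp1251 data.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

class TranscodeDemo {
    /** Decodes the whole array in one step. */
    static String viaConstructor(byte[] data, Charset charset) {
        return new String(data, charset);
    }

    /** Decodes through a Reader, which is the better fit for streamed input. */
    static String viaReader(byte[] data, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(data), charset)) {
            char[] chunk = new char[256];
            int n;
            while ((n = reader.read(chunk)) != -1) {
                sb.append(chunk, 0, n);
            }
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked for brevity.
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The first word of the question's example, as cp1251 bytes.
        byte[] data = { (byte) 0xCF, (byte) 0xE5, (byte) 0xF1, (byte) 0xED, (byte) 0xFF };
        Charset cp1251 = Charset.forName("windows-1251");
        System.out.println(viaConstructor(data, cp1251).equals(viaReader(data, cp1251)));
    }
}
```

For a frame that has already been read with readFully, the constructor is the simpler choice.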

Parsing the string component of an ID3v2.4.0 data structure would look something like this:

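    // untested code
    // needs: java.io.ByteArrayOutputStream, java.io.DataInputStream,
    //        java.io.IOException, java.util.Arrays
    public String parseID3String(DataInputStream in) throws IOException {
        String[] encodings = { "ISO-8859-1", "UTF-16", "UTF-16BE", "UTF-8" };
        // The first byte selects the text encoding ($00..$03);
        // readUnsignedByte() throws EOFException instead of returning -1.
        String encoding = encodings[in.readUnsignedByte()];
        // UTF-16 variants are terminated with $00 00, the others with $00.
        byte[] terminator = encoding.startsWith("UTF-16") ? new byte[2] : new byte[1];
        byte[] buf = terminator.clone();
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        while (true) {
            in.readFully(buf);
            if (Arrays.equals(terminator, buf)) {
                break; // do not copy the terminator into the decoded string
            }
            buffer.write(buf);
        }
        return new String(buffer.toByteArray(), encoding);
    }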

This worked for me:

    byte[] bytes = s.getBytes("ISO-8859-1");
    UniversalDetector encDetector = new UniversalDetector(null);
    encDetector.handleData(bytes, 0, bytes.length);
    encDetector.dataEnd();
    String encoding = encDetector.getDetectedCharset();
    if (encoding != null) s = new String(bytes, encoding);
