Sometimes checking a file's type with the file command turns up something interesting:
[root@feel ~]# file system.cfg
system.cfg: Non-ISO extended-ASCII text, with CRLF, NEL line terminators
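The "with CRLF, NEL line terminators" part indicates that the file came from Windows.
A Linux text file is reported as: ASCII text
A Windows text file is reported as: ASCII text, with CRLF line terminators
Viewing the file with cat -v filename shows a ^M at the end of each line.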
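The ^M check above can also be done without reading the file by eye. The following is a minimal sketch, not part of the original session: count_crlf_lines is a hypothetical helper name, and the sample file is made up for illustration.

```shell
#!/bin/bash
# Count the lines in a file that end with a carriage return
# (the byte that cat -v displays as ^M).
count_crlf_lines() {
    local cr
    cr=$(printf '\r')          # a literal CR byte
    grep -c "${cr}\$" "$1"     # pattern: CR immediately before end of line
}

# Hypothetical sample: two CRLF lines followed by one LF-only line.
sample=$(mktemp)
printf 'a=1\r\nb=2\r\nc=3\n' > "$sample"
count_crlf_lines "$sample"     # prints 2
```

A count of 0 suggests the file already uses Unix line endings.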
yum install dos2unix -y
dos2unix system.cfg
This removes the CRLF line terminators; a file that contains only ASCII characters is then reported as ASCII text.
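If dos2unix cannot be installed, tr can stand in for it. This is a different technique from the one used above, shown only as a sketch: strip_cr is a hypothetical helper, and unlike dos2unix it deletes every carriage return in the file, not only those at the end of a line.

```shell
#!/bin/bash
# Copy a file while deleting all carriage return bytes.
# Assumption: the file has no meaningful CR bytes other than line endings.
strip_cr() {
    tr -d '\r' < "$1" > "$2"
}

# Hypothetical sample with CRLF line endings.
src=$(mktemp)
dst=$(mktemp)
printf 'a=1\r\nb=2\r\n' > "$src"
strip_cr "$src" "$dst"
od -c "$dst"                   # inspect the bytes; no \r should remain
```

Writing to a second file keeps the original untouched until the result has been checked.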
Use enca to detect the file's encoding:
yum install enca
[root@feel ~]# enca --list languages
belarusian: CP1251 IBM866 ISO-8859-5 KOI8-UNI maccyr IBM855 KOI8-U
bulgarian: CP1251 ISO-8859-5 IBM855 maccyr ECMA-113
czech: ISO-8859-2 CP1250 IBM852 KEYBCS2 macce KOI-8_CS_2 CORK
estonian: ISO-8859-4 CP1257 IBM775 ISO-8859-13 macce baltic
croatian: CP1250 ISO-8859-2 IBM852 macce CORK
hungarian: ISO-8859-2 CP1250 IBM852 macce CORK
lithuanian: CP1257 ISO-8859-4 IBM775 ISO-8859-13 macce baltic
latvian: CP1257 ISO-8859-4 IBM775 ISO-8859-13 macce baltic
polish: ISO-8859-2 CP1250 IBM852 macce ISO-8859-13 ISO-8859-16 baltic CORK
russian: KOI8-R CP1251 ISO-8859-5 IBM866 maccyr
slovak: CP1250 ISO-8859-2 IBM852 KEYBCS2 macce KOI-8_CS_2 CORK
slovene: ISO-8859-2 CP1250 IBM852 macce CORK
ukrainian: CP1251 IBM855 ISO-8859-5 CP1125 KOI8-U maccyr
chinese: GBK BIG5 HZ
none:
[root@feel ~]# enca -L chinese system.cfg
Unrecognized encoding
With a different language hint, enca does report an encoding:
[root@feel ~]# enca -L czech system.cfg
Kamenicky encoding; KEYBCS2
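Since the answer depends on the language hint, one option is to try every language that enca lists. The sketch below assumes the listing has the "name: encodings" shape shown above; language_names is a hypothetical helper, and the loop only runs when enca and system.cfg are both present.

```shell
#!/bin/bash
# Read `enca --list languages` output on stdin and print only the
# language names (the word before each colon).
language_names() {
    sed -n 's/^ *\([a-z]*\):.*/\1/p'
}

# Ask enca for its guess under every language hint.
if command -v enca > /dev/null 2>&1 && [ -f system.cfg ]; then
    for lang in $(enca --list languages | language_names); do
        printf '%s: ' "$lang"
        enca -L "$lang" system.cfg
    done
fi
```

Different hints can produce different guesses for the same file, so the output is a list of candidates rather than a verdict.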
iconv --list shows the encodings that iconv supports.
You can write a script that tries the encodings one by one.
The script is as follows:
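#!/bin/bash
# Save the encoding names iconv knows, stripping the trailing "//" from each line.
iconv --list | sed 's/\/\/$//' | sort > encodings.list
# Try to decode system.cfg from each encoding and record whether iconv accepts it.
for a in $(cat encodings.list); do
    printf '%s ' "$a"
    iconv -f "$a" -t UTF-8 system.cfg > /dev/null 2>&1 && echo "ok: $a" || echo "fail: $a"
done | tee result.txt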
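After the script finishes, look at the ok entries in result.txt. An ok only means that iconv could decode every byte, so check the converted text to find the encoding that actually reads correctly.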
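Many single-byte encodings accept almost any input, so several candidates usually come back ok. A quick way to compare them is to print the first line of the file under each one. This is a sketch under assumptions: preview_candidates is a hypothetical helper, and the sample bytes are made up for illustration.

```shell
#!/bin/bash
# Print the first line of a file as decoded from each candidate encoding.
# Usage: preview_candidates FILE ENCODING...
preview_candidates() {
    local file=$1 enc
    shift
    for enc in "$@"; do
        printf '%s: ' "$enc"
        iconv -f "$enc" -t UTF-8 "$file" 2>/dev/null | head -n 1
    done
}

# Hypothetical sample: the word "café" stored as ISO-8859-1 (é is byte 0xE9).
sample=$(mktemp)
printf 'caf\351\n' > "$sample"
# Both conversions succeed, but only one shows the intended word.
preview_candidates "$sample" ISO-8859-1 ISO-8859-5
```

For the real file, the candidate names could be pulled out of result.txt with sed -n 's/.*ok: //p' result.txt and passed as the encoding arguments.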
Once the source encoding is known, the file can be converted:
iconv -f SOURCE_ENCODING -t UTF-8 INPUT_FILE > OUTPUT_FILE
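One thing to watch with the redirect is that the output file is created even when iconv fails partway through. A possible guard, sketched here with a hypothetical convert_to_utf8 helper and a made-up sample file, is to convert into a temporary file and move it into place only on success.

```shell
#!/bin/bash
# Convert IN from encoding FROM to UTF-8, writing OUT only if iconv succeeds.
# Usage: convert_to_utf8 FROM IN OUT
convert_to_utf8() {
    local from=$1 in=$2 out=$3 tmp
    tmp=$(mktemp) || return 1
    if iconv -f "$from" -t UTF-8 "$in" > "$tmp"; then
        mv "$tmp" "$out"       # note: OUT gets the temp file's permissions
    else
        rm -f "$tmp"           # leave no partial output behind
        return 1
    fi
}

# Hypothetical sample: "café" stored as ISO-8859-1.
src=$(mktemp)
dst=$(mktemp)
printf 'caf\351\n' > "$src"
convert_to_utf8 ISO-8859-1 "$src" "$dst" && cat "$dst"   # prints café
```

With a wrong source encoding that iconv rejects, the function returns non-zero and the destination file is left as it was.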