handle HTML entity characters properly +priority