Yes, they can. If you take a simple problem and compute the derivatives by backpropagation, you will see that the gradients scale with the magnitude of your cost.
Now imagine your cost is CE versus 100*CE. In a convex optimization problem you will reach the same minimum for the parameters, since 100 is just a constant factor. However, the gradients are scaled by that same factor, so the same learning rate can make your model oscillate.
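To see this concretely, here is a minimal sketch (assuming PyTorch, which is not specified in the question) comparing the gradients of CE and 100*CE on the same toy batch:

```python
import torch
import torch.nn.functional as F

# Toy logits for a batch of 3 samples and 3 classes, treated as the parameters we differentiate.
logits = torch.randn(3, 3, requires_grad=True)
targets = torch.tensor([0, 1, 2])

# Plain cross-entropy.
F.cross_entropy(logits, targets).backward()
grad_plain = logits.grad.clone()

# The same loss scaled by a constant factor of 100.
logits.grad = None
(100 * F.cross_entropy(logits, targets)).backward()
grad_scaled = logits.grad.clone()

# The gradients are scaled by exactly the same factor,
# so a fixed learning rate takes steps that are 100x larger.
print(torch.allclose(grad_scaled, 100 * grad_plain))  # True
```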
Now, regarding different weights applied to the cross entropy: suppose a three-class problem with (unbalanced) classes c1, c2 and c3. In the standard setting the cost is computed as:
CE = \frac{1}{N} \sum_i CE(x_i, c_i)
where N is the total number of samples. So if you want to apply different weights to the cross entropy, such as:
CE = \frac{1}{N} \sum_i w_i \cdot CE(x_i, c_i)
in order to give more importance to the under-represented class. Ideally, you should choose the weights so that the overall cost keeps the same relative scale as the unweighted cost. For instance, given three samples, one from each class, you could compute:
CE = 1 \cdot CE(x_1, c_1) + 2 \cdot CE(x_2, c_2) + 3 \cdot CE(x_3, c_3)
CE = 2 \cdot CE(x_1, c_1) + 4 \cdot CE(x_2, c_2) + 6 \cdot CE(x_3, c_3)
CE = 100 \cdot CE(x_1, c_1) + 200 \cdot CE(x_2, c_2) + 300 \cdot CE(x_3, c_3)
In all of these cases the relative importance given to each class is the same, but the gradients scale very differently. The best approach is therefore to use normalized weights, which in this case would be:
CE = \frac{1}{6} \cdot CE(x_1, c_1) + \frac{2}{6} \cdot CE(x_2, c_2) + \frac{3}{6} \cdot CE(x_3, c_3)
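As a rough sketch (again assuming PyTorch; the weight vectors below are just the ones from the examples above), you can check that the proportional weightings give the same relative importance per class but gradients of very different magnitude, while the normalized weights keep the gradient scale comparable to the unweighted case:

```python
import torch
import torch.nn.functional as F

logits = torch.randn(3, 3, requires_grad=True)
targets = torch.tensor([0, 1, 2])  # one sample per class: c1, c2, c3

def grad_norm(weights):
    """Gradient norm of the weighted (summed) cross-entropy for the given per-sample weights."""
    logits.grad = None
    per_sample = F.cross_entropy(logits, targets, reduction='none')
    (weights * per_sample).sum().backward()
    return logits.grad.norm().item()

for w in ([1., 2., 3.], [2., 4., 6.], [100., 200., 300.], [1/6, 2/6, 3/6]):
    print(w, grad_norm(torch.tensor(w)))
# The first three weightings produce gradients that differ only by a constant factor
# (2x and 100x the first), while the normalized weights keep the magnitude close to
# that of the unweighted mean cross-entropy.
```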
Hope it helps.