In TensorFlow, the binary cross-entropy loss function is implemented in a way that ensures numerical stability and avoids overflow. The formulation can be found in the official docs, but it is not easy to follow when written as pseudo-code, so I decided to typeset it in TeX (replacing the notation $z$ with $y$).
The logistic loss, with label $y$, logit $x$, and predicted probability $p = \operatorname{sigmoid}(x)$, is
\[\begin{align*}
\mathcal{L} &= - y \log(p) - (1 - y) \log(1-p) \\
&= - y \log(\operatorname{sigmoid}(x)) - (1 - y) \log(1-\operatorname{sigmoid}(x)) \\
&= - y \log \left(\frac{1}{1+e^{-x}} \right) - (1 - y) \log \left(1-\frac{1}{1+e^{-x}} \right) \\
&= - y \log \left(\frac{1}{1+e^{-x}} \right) - (1 - y) \log \left(\frac{e^{-x}}{1+e^{-x}} \right) \\
&= y \log({1+e^{-x}}) + (1 - y)\left[- \log(e^{-x}) + \log({1+e^{-x}}) \right] \\
&= y \log({1+e^{-x}}) + (1 - y)\left[x + \log({1+e^{-x}}) \right] \\
&= (1 - y)(x) + \log({1+e^{-x}}) \\
&= x - x \times y + \log({1+e^{-x}})
\end{align*}\]
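As a quick sanity check, here is a minimal NumPy sketch of this naive form (the function name `bce_naive` is mine for illustration, not TensorFlow's). It matches the cross-entropy for moderate logits but overflows once $e^{-x}$ gets large.

```python
import numpy as np

def bce_naive(x, y):
    """Naive logistic loss: x - x*y + log(1 + exp(-x)). Illustrative only."""
    return x - x * y + np.log1p(np.exp(-x))

# Moderate logits: matches -y*log(p) - (1-y)*log(1-p) with p = sigmoid(x).
print(bce_naive(np.array([2.0, -2.0]), np.array([1.0, 0.0])))  # ~[0.1269, 0.1269]

# Very negative logit: exp(-x) overflows and the result becomes inf,
# even though the true loss here is about 1000 (y = 1, x = -1000).
print(bce_naive(np.array([-1000.0]), np.array([1.0])))  # [inf], with an overflow warning
```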
For $x < 0$, to avoid overflow in $e^{-x}$, we reformulate the expression above:
\[\begin{align*}
\mathcal{L} &= x - x \times y + \log({1+e^{-x}}) \\
&= \log(e^{x}) - x \times y + \log({1+e^{-x}}) \\
&= - x \times y + \log(e^{x} \times ({1+e^{-x}})) \\
&= - x \times y + \log(1 + e^{x})
\end{align*}\]
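In code, this branch keeps everything bounded when the logit is negative, since $e^{x} \leq 1$ for $x < 0$. A small sketch under the same assumptions as above (names are illustrative):

```python
import numpy as np

def bce_neg_branch(x, y):
    """Equivalent form for x < 0: -x*y + log(1 + exp(x)), where exp(x) <= 1."""
    return -x * y + np.log1p(np.exp(x))

# Same extreme case as before: the exact loss -log(sigmoid(-1000)) is ~1000,
# and this form computes it without overflowing.
print(bce_neg_branch(np.array([-1000.0]), np.array([1.0])))  # [1000.]
```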
Hence, to ensure stability and avoid overflow, the implementation uses this equivalent formulation
\[\begin{align*}
\mathcal{L} &= \max(x,0) - x \times y + \log({1+e^{-|x|}}) \\
&= \operatorname{ReLU}(x) - x \times y + \log({1+e^{-|x|}})
\end{align*}\]
(To be clear, this last formulation combines $x - x \times y + \log({1+e^{-x}})$ for $x \geq 0$ and $- x \times y + \log(1 + e^{x})$ for $x < 0$ into a single expression: when $x \geq 0$, $\max(x,0) = x$ and $|x| = x$; when $x < 0$, $\max(x,0) = 0$ and $|x| = -x$.)
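Putting it together, here is a minimal NumPy sketch of the combined stable form (again, the function name is mine; TensorFlow's `tf.nn.sigmoid_cross_entropy_with_logits` should return matching values for the same labels and logits):

```python
import numpy as np

def bce_stable(x, y):
    """Stable logistic loss: max(x, 0) - x*y + log(1 + exp(-|x|)). Illustrative sketch."""
    return np.maximum(x, 0.0) - x * y + np.log1p(np.exp(-np.abs(x)))

logits = np.array([8.0, -8.0, 1000.0, -1000.0])
labels = np.array([1.0, 0.0, 0.0, 1.0])

# No overflow for any sign or magnitude of the logit:
print(bce_stable(logits, labels))  # ~[3.35e-4, 3.35e-4, 1000., 1000.]

# For comparison (if TensorFlow is installed), the following should give the same values:
# import tensorflow as tf
# tf.nn.sigmoid_cross_entropy_with_logits(labels=labels, logits=logits)
```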