This paper examines the application of reinforcement learning to a telecommunications networking problem. The problem requires that revenue be maximized while simultaneously meeting a quality of service constraint that forbids entry into certain states. We present a general solution to this multi-criteria problem that is able to earn significantly higher revenues than alternatives.