Why you need TWO reports for Coordinated Access

Now that HIFIS version 4.0.59 is out and has been widely adopted, several communities have come to me asking for customized By-Name Lists (or Prioritization Lists). This report includes a list of clients and columns for various prioritization criteria, including ones not found in the Coordinated Access module. Sometimes, however, communities ask for columns to indicate whether the client was newly identified, or recently returned from inactive or housing. That's all fine.

But sometimes they also ask for the report to indicate whether the client was recently housed or moved to inactive.

That's a problem.

Why? Because the prioritization list started out by displaying only the clients who were active and homeless, and criteria you'd use to prioritize them. All of a sudden, by including columns asking if they were recently housed or moved to inactive, the report can no longer filter to just show those who are active and homeless. Instead, it needs to show basically everybody.

Let's say you have 1000 clients in your database and 100 of them are actively homeless today. If you just want to show the actively homeless ones, we would tell the database: find each client that is both active and homeless right now. That's 1000 queries (that's the fancy database word). Let's imagine that each query is 1 time unit, so we're at 1000 units of time. Although your server speed may vary, you could imagine that your server might do 1000 queries a minute, so we're currently at 1 minute of runtime. This is just a wild estimate! Don't take my word for the exact timing here.

But, if you want everyone who is actively homeless or inactively homeless or actively housed, then we now need to examine all 1000 records, look up the history for them and see if they meet one of the possible conditions that would include them in a report. It might go something like this:
  1. Is the client active and homeless right now (query 1)?
  2. Is the client currently active and housed (query 2)? If so, were they active and homeless sometime in the last month (query 3)?
  3. Is the client currently inactive and homeless (query 4)? If so, were they active and homeless sometime in the last month (query 5)?
Basically, the fact that you're checking each client file in your entire database for multiple reasons why they might be included takes additional processing time. Right now, we're using 5000 units of time, five times as much as if we were just looking at the actively homeless clients. Assuming again that 1000 queries = 1 minute, we're now at 5 minutes.

So what, you say? My server is super speedy, and I don't have that many clients. Why should it matter?

The answer is because you really have two different needs:
  1. A report that shows you who is currently homeless today, who you'd like to prioritize and see lots of data about them.
  2. A report that shows who had housing transitions in a period of time, but you don't need to see all the prioritization data about them.
Because these are quite distinct, you need to display different data for them. Trying to do it all in one place is going to be very resource-intensive because you're displaying data you don't need.

After your initial queries to decide who should be included, you then want to display extra data. For report 1, let's say you have 30 prioritization columns, which you want to display for each of your 100 clients. Each of those 30 columns represents additional queries: 100 * 30 is 3000 more queries, or 3000 more units of time, which we can add to our initial 1000 units of time to find the right clients. Altogether, this report takes 4000 units of time to process (4 minutes).

For report 2, let's imagine that we had 100 active clients, as well as 25 outflow clients (125 total). We don't, however need as many columns, let's say only 5. That's 125 clients times 5 queries each: 125 * 5 is 625 units of time. We will add that to the initial 5000 units for a total of 5625 units of time for report 2.

But if you combined it into one report, you'd have 125 clients (from report 2) * 30 columns (from report 1) + 5 columns (from report 2), so that's 4375 units of time. And we'll add that onto the 5000 time units it takes to figure out what clients we need to include for a total of 9375 time units. Although technically this is less time than it would take to run two different reports consecutively, the key difference is this: if it was all combined, you'd have a single report that took 9375 units of time all in one go, rather than split up into two, so you're much more likely to time out or give your server a hard time. Using our previous estimate of 1000 time units = 1 minute, you'd be pushing 10 minutes here, which is the timeout limit for many communities.

My point is that, even though it looks to you like you might only be including an extra 25% more clients, it takes twice as long to compute and run!

Instead, I strongly advise that you have one report that just does prioritization for your actively homeless clients, and a second report that looks at your inflow and outflow. It's much more efficient to break it up that way! (And it doesn't necessarily cost more. There's lots of work being done on these fronts even as you read this!)

Check out some recent prioritization lists that we've worked on for communities:
And, for your inflow and outflow:
How to Record Street Outreach in HIFIS 4 (Revisited)
New Feature: Clone Users

0 comments

There are no comments yet. Be the first one to leave a comment!