Category Archives: leetcode

Leetcode [Hard] – “Split Array Largest Sum” in C++

There are some problems on Leetcode that are medium or hard that if you know the trick or have seen it before, you will know how to do it, otherwise, there is not a single way you could come up with solutions under limited time during an interview. There are also types of problems either hard, medium, or easy that you can solve solely with your intelligence or repetitive practicing on similar types of problems.

A Hard problem that I encountered covers both the categories I mentioned above, Split Array Largest Sum. There are two approaches to solve this problem, one with dynamic programming, which you can eventually come up with with enough time given and experience in dynamic programming. Another solution that is much faster, but much trickier. I rather don’t believe anyone could come up with that solution in an interview unless they have seen it before or they are genius-level good at problem-solving.

Let’s look at the problem first.

Given an array nums which consists of non-negative integers and an integer m, you can split the array into m non-empty continuous subarrays.

Write an algorithm to minimize the largest sum among these m subarrays.

Example 1:

Input: nums = [7,2,5,10,8], m = 2
Output: 18
Explanation:
There are four ways to split nums into two subarrays.
The best way is to split it into [7,2,5] and [10,8],
where the largest sum among the two subarrays is only 18.

Example 2:

Input: nums = [1,2,3,4,5], m = 2
Output: 9

Example 3:

Input: nums = [1,4,4], m = 3
Output: 4

Constraints:

  • 1 <= nums.length <= 1000
  • 0 <= nums[i] <= 106
  • 1 <= m <= min(50, nums.length)
https://leetcode.com/problems/split-array-largest-sum/

This was listed under the dynamic programming category, and you should be considering using DP whenever seeing keywords like “subarray”, “minimum/maximum”, “largest/smallest”, etc.

I will provide three solutions for this problem: dynamic programming top-down, the slowest among all three, but should be obvious to come up with. Dynamic programming bottom-up, slightly faster than the top-down method, but not at all competitive among other people’s submissions. And lastly, the fastest one, binary search method.

Solution 1 – Dynamic Programming Top-down

In almost all cases of dynamic programming problems that are easily recognizable, the first approach is to come up with a brute-force solution, which is relatively challenging for this specific problem, but once you got it running, try to come up with a DP table to minimize the re-computation for the same input and maximize the reusability of stored output.

Approach

  1. Notice for this problem, if we want to implement the brute-force solution, then we will have to constantly compute the sum of subarrays of the input vector. To turn this O(n) computation into an O(1), we simply implement another vector called prefixSum with the same size as nums and set each element at index i = the sum of all elements in nums before and including i. This way, whenever we want to compute the sum of subarray between i and j, simply compute prefixSum[i] – prefixSum[j] instead of having a loop that computes the sum from i to j.
  2. As for the construction of the DP table, we create a 2-d matrix with respect to the starting indices and m, dp[start][m]. For every entry in the table, they represent the minimum largest sum of subarrays starting from index start with the number of m. The base case for this DP table is when m is 1, then the minimum largest subarray sum is prefixSum[size of nums] – prefixSum[start]. Since we only need one subarray, the maximum sum would just be the sum of the subarray from the index we are current;y checking to the last element in nums.
  3. Then, for every starting index that we iterate, recursively compute the largest subarray sum, then memorize the minimum largest subarray sum, and store the result to the according position in the DP table.

Algorithm

1. Fill in prefixSum list.
2. Recursively find the minimum largest subarray sum of nums starting for (start = 0 and m = give m).
     1. If result can be found in dp table, return the result immediately.
     2. if m = 1, return prefixSum[size of nums] - prefixSum[start]
     3. For all indices between start and size of nums - m
          1. find the sum of the subarray starting from beginning to start
          2. find the largest subarray sum with start + 1 and m - 1 by making a recursive call.
          3. if the sum of subarray from beginning to start is greater than the largest subarray sum after start, there is no point moving forward.
     4. store the result in dp table and return it.

Code Implementation

class Solution {
public:
    int dp[1001][1000];
    
    int splitArray(vector<int>& nums, int m) {
        memset(dp, -1, sizeof(dp));
        vector<int> prefixSum(nums.size() + 1, 0);
        for (int i = 0; i < nums.size(); i++) {
            prefixSum[i + 1] = prefixSum[i] + nums[i];
        }
        return split (m, 0, prefixSum);
    }
    
    int split (int m, int start, vector<int>& pref) {
        auto n = (int)pref.size() - 1;
        
        if (dp[m][start] != -1) return dp[m][start];
        
        if (m == 1) return dp[m][start] = pref[n] - pref[start];
        
        auto min = INT_MAX;
        
        for (auto i=start; i<=n-m; ++i) {
            auto curr = pref[i+1] - pref[start];
            
            auto max = ::max(curr, split(m-1, i+1, pref));
            
            min = ::min (min, max);
            
            // since start is always increasing, that means max is also always increasing is i++, since curr > max.
            if (curr >= max) break;
        }
        
        return dp[m][start] = min;
    }
};

Solution 2 – Dynamic Programming Bottom-up

Approach

After having the top-down version of DP implementation running, notice some entries reply to other entries. Moreover, you can initialize many of the entries, and start filling out the table from there.

The bottom-up approach of DP is usually faster, but not as obvious as the top-down. The best way to come up with a bottom-up implementation is to re-create the table and fill it up starting from the base case, then you will likely find the pattern that is easily implementable with DP.

Algorithm

1. similar to top-down approach, have a prefixSum list ready.
2. For all i := numbers of subarray m starting from 1
3. For all j := starting index from 0 to size of nums
     1. When i is 1, which is base case, fill the dp table.
     2. For starting index k := from j to size of nums - i
          1. FirstSum := sum of subarray from k+1 to j
          2. largestSum := max (FirstSum, dp[k+1][i-1])
          3. minimumSum = min(minimumSum, largetSum)
          4. If FirstSum > minimSum, exit loop
      3. Store minimumSum in dp table
4. return dp[0][m]

Solution 3 – Binary Search

For this solution, we start by simply “guessing” the solution, then adjust our guessing by examing how the result of our guess compares to our desired output. And by guess, I mean using binary search.

Approach

  1. Let’s “guess” a minimum largest subarray sum X, which must be greater than or equal to the largest number in nums, otherwise, we will have to split a single entry, which is not possible. X also has to be less than or equal to the sum of all numbers in nums since it’s the maximum value we can get when m = size of nums.
  2. Then, based on our guess, we find out how many split arrays we need so that every subarray sum is less than or equal to X. The way to do this is just to iterate through nums and have a sum that keeps track of the sum of all numbers in the current subarray. As soon as sum is greater than X, we increment number of subarrays and return the number of subarrays + 1.
  3. Based on the number of subarrays computed above, if its <= m, record X because if the current number of subarrays is a good result for X, then its also a good result for any number above X since all subarray sums are less than X. If it’s greater than m, then we will have to increment X since we need fewer subarrays, thus, increase X will have more numbers in each subarray.
  4. Return the recorded X in the end.

Algorithm

1. Set left := largest number in nums, right := sum of all numbers in nums
2. while left <= right <- start binary search
     1. let X = (left + right) / 2
     2. left splits := number of splits needed for largest subarray sum = X
     3. if split <= m
          1. record X, update right := X-1
     4. else
          1. update left := X+1
3. return recorded X

Code Implementation

class Solution {
public:
    int getSplits (vector<int>& nums, int maxSum) {
        auto sum = 0;
        auto to_return = 0;
        
        for (auto& num : nums) {
            if (sum + num <= maxSum)
                sum += num;
            else {
                sum = num;
                to_return ++;
            }
        }
        
        return to_return + 1;
    }
    
    
    int splitArray(vector<int>& nums, int m) {
        auto sum = 0, max = INT_MIN;
        
        for (auto& num : nums) {
            sum += num;
            max = ::max(max, num);
        }
        
        auto left = max, right = sum;
        auto to_return = 0;
        
        while (left <= right) {
            auto to_check = (left + right) / 2;
            
            if (getSplits (nums, to_check) <= m) {
                right = to_check - 1;
                to_return = to_check;
            }
            else
                left = to_check + 1;
        }
        
        return to_return;
        
    }
};

Key Takeaways

  1. For DP problems, first, come up with a brute-force solution, then create a DP table for optimization. Try the bottom-up approach since it’s usually faster.
  2. Try to overlook the DP problem, and look for other solutions that could be faster, in this case, the binary search method.
  3. Re-create the DP table will give a sense of how entries on the table depend on each other, it makes the implementation of the bottom-up approach easier.
  4. The guessing method is sometimes a good way to solve problems, just adjust the guess at every computation, like quickselect!

Leetcode [Hard] – “Odd Even Jump” in C++

While I was doing some leetcode problems, one interesting question drew my attention. I spent quite some times on this problem, and ended up solving it with dynamic programming (dp).

This problem was particularly interesting and made me write a blog about it because it’s quite a representative dp problem and also uses functions of map that I wasn’t too familiar with, upper_bound and lower_bound. While I was working on this problem, I learned many of the awesome tricks and techniques for solving such kind of dp problems and acheive near-optimal efficiency.

Here is the description of the problem:

You are given an integer array arr. From some starting index, you can make a series of jumps. The (1st, 3rd, 5th, …) jumps in the series are called odd-numbered jumps, and the (2nd, 4th, 6th, …) jumps in the series are called even-numbered jumps. Note that the jumps are numbered, not the indices.

You may jump forward from index i to index j (with i < j) in the following way:

  • During odd-numbered jumps (i.e., jumps 1, 3, 5, …), you jump to the index j such that arr[i] <= arr[j] and arr[j] is the smallest possible value. If there are multiple such indices j, you can only jump to the smallest such index j.
  • During even-numbered jumps (i.e., jumps 2, 4, 6, …), you jump to the index j such that arr[i] >= arr[j] and arr[j] is the largest possible value. If there are multiple such indices j, you can only jump to the smallest such index j.
  • It may be the case that for some index i, there are no legal jumps.

A starting index is good if, starting from that index, you can reach the end of the array (index arr.length - 1) by jumping some number of times (possibly 0 or more than once).

Return the number of good starting indices.

Example 1:

Input: arr = [10,13,12,14,15]
Output: 2
Explanation: 
From starting index i = 0, we can make our 1st jump to i = 2 (since arr[2] is the smallest among arr[1], arr[2], arr[3], arr[4] that is greater or equal to arr[0]), then we cannot jump any more.
From starting index i = 1 and i = 2, we can make our 1st jump to i = 3, then we cannot jump any more.
From starting index i = 3, we can make our 1st jump to i = 4, so we have reached the end.
From starting index i = 4, we have reached the end already.
In total, there are 2 different starting indices i = 3 and i = 4, where we can reach the end with some number of
jumps.

At first glance, this problem is quite confusing because of the way that it explains itself, as well as some of the conditions aren’t told in the text. But a general sense for this problem is you start from some index i, and i is a good starting index if you can somehow reach the last element in the array by having a series of odd and even jumps. The condition for performing an odd and even jump is described above. We find and return the number of good starting index.

Observation

  1. Notice for every given input array with size > 0, if we set starting index to array.size()-1, in other words, last element, then it will always jump to the end since its already at the end. So last element is always a good starting index.
  2. Second to last element always depends on the last element since its first jump is a jump onto last element. This means that if the jump from second to last element is a valid jump, i.e., last element >= second to last element, then we already know its a good starting index since we know forsure last element can always reach to the end.
  3. An inductive result drived from the second observation tells us that as we go backward from the last element in the array, each jump it makes uses a result that we have already computed, thus, no need to redo all the unnecessary jumping again. And this is where Dynamic Programming comes into play.

Dynamic Programming

Dynamic programming is an algorithmic optimization for plain recursion. In many of the problem, we keep making recursive calls for same inputs, but wiith dynamic programming solution, results for some of the inputs are stored so that when they are being used again, we don’t have to re-compute them. To be honest, I am no expert in dynamic programming, it is very much of a shortcoming of mine, but I will post more interesting leetcode dp problem and their solutions to make myself more confortable with it.

class Solution {
public:
    int oddEvenJumps(vector<int>& A) {
        int size = A.size();
        int to_return = 1;
        
        vector<int> high(size);
        vector<int> low(size);
        
        high[size-1] = 1;
        low[size-1] = 1;
        
        map<int, int> m;
        m[A[size-1]] = size-1;
        
        for (auto i = size-2; i>=0; --i) {
            auto hi = m.lower_bound(A[i]), lo = m.upper_bound(A[i]);
            
            if (hi != m.end()) high[i] = low[hi->second];
            if (lo != m.begin()) low[i] = high[(--lo)->second];
            
            if (high[i]) to_return++;
            
            m[A[i]] = i;
        }
        
        return to_return;
    }
};

Above is the solution I have, the credit of the idea goes to https://leetcode.com/lee215.

On Line 5 I created the result and initialized it to 1 since we already know for any given array size > 0, last element is always a good starting index. Line 6 and line 7 I declared two vectors high and low of size equal to the size of array. They of type int since there is no primitive datatype boolean in C++, and the value at index j of both vectors represents a is A[j] a good starting index or not. It goes without being said, the last value of vector high and low is 1 (1 means good starting index, 0 means not) based on my first observation, which is what I did on line 10 and line 11.

Notice on line 13, I created a map and inserted the last element of the input array with its value as key and index as value in the following line. The reason why map is being used is that in C++, map stores its value as a binary tree structure and sorts its data based on its key value. And also I can used functions upper_bound and lower_bound. I will explain those those functions do later. Map will be used for checking is there a valid jump for a certain index.

In the forloop between line 16 and line 25 is where the magic happens. Starting from the second to last element (since check the last element doesn’t need to be checked), I find a hi and a lo. hi variable stores the key-value pair from map that is >= element we are checking, and lo stores the first element that goes after the element we are checking. Ok, so hi stores the next element we can jump to with a odd jump, but what the hell is lo? Well, interestingly, upper_bound doesn’t give you an element that is actually upper bounded by the given value for some reason, it gives the first element that is greater than the input value. For example, if we have an array [1, 2, 3, 4, 5], upper_bound(3) gives you 4 instead of 3, so in order to get the value I need, I have to decrement it to get the iterator that is <= the input A[i] (Line 20).

On line 22, to increment to_return if high[i], in other words, if the odd jump from index i is takes us to the end. Then finally on line 24, we insert a new key-value pair into map.

Takeaways

  • For problems that recursively re-computing the result from the same input, we can always use dp. In this case, we have two vectors high and low that records if each starting index is a good starting index or not.
  • Map is always a good data structure if we want to continuely adding key-value pair and have them sorted along the way since both insert and erase takes O(log(n)). And its upper_bound and lower_bound functions is perfect for this particular problem.
  • Process an array in reverse order can sometimes give better performance.